All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/4] btrfs-progs repair support for unaligned/mismatched device sizes
@ 2017-10-10  7:51 Qu Wenruo
  2017-10-10  7:51 ` [PATCH 1/4] btrfs-progs: Introduce functions to repair unaligned/mismatch device size Qu Wenruo
                   ` (4 more replies)
  0 siblings, 5 replies; 14+ messages in thread
From: Qu Wenruo @ 2017-10-10  7:51 UTC (permalink / raw)
  To: linux-btrfs; +Cc: dsterba, yoasif, rrauenza

The patchset can be fetched from github:
https://github.com/adam900710/btrfs-progs/tree/check_unaligned_dev

There are several reports in mail list for btrfs device size related
problems.

1) Unmountable fs, due to mismatched super total_bytes
   Unmountable if super total_bytes is smaller than total rw bytes of
   all devices.
   Root cause under investigation, but only one report here.

   This patchset provides the tool to fix it offline.
   (At least better than unmountable forever)

2) Harmless kernel warning for btrfs_update_device()
   v4.14 introduced restrict device size checker.
   This somewhat break the backward compatibility and causing kernel
   warning.

   It can be fixed online with "btrfs filesystem resize".
   (Although it is better to fixed it at mount time)

   This patchset also provide a fallback method to fix it.

Qu Wenruo (4):
  btrfs-progs: Introduce functions to repair unaligned/mismatch device
    size
  btrfs-progs: fsck: Introduce --fix-dev-size option
  btrfs-progs: check: Also check unalignment/mismatch device and super
    size
  btrfs-progs: test/fsck: Add test case image for --fix-dev-size

Qu Wenruo (4):
  btrfs-progs: Introduce functions to repair unaligned/mismatch device
    size
  btrfs-progs: fsck: Introduce --fix-dev-size option
  btrfs-progs: check: Also check unalignment/mismatch device and super
    size
  btrfs-progs: test/fsck: Add test case image for --fix-dev-size

 Documentation/btrfs-check.asciidoc                 |  23 ++
 cmds-check.c                                       | 292 ++++++++++++++++++++-
 .../dev_and_super_mismatch_unaligned.raw.xz        | Bin 0 -> 21536 bytes
 3 files changed, 314 insertions(+), 1 deletion(-)
 create mode 100644 tests/fsck-tests/027-unaligned-super-dev-sizes/dev_and_super_mismatch_unaligned.raw.xz

-- 
2.14.2


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/4] btrfs-progs: Introduce functions to repair unaligned/mismatch device size
  2017-10-10  7:51 [PATCH 0/4] btrfs-progs repair support for unaligned/mismatched device sizes Qu Wenruo
@ 2017-10-10  7:51 ` Qu Wenruo
  2017-10-10  8:24   ` Nikolay Borisov
  2017-10-10  7:51 ` [PATCH 2/4] btrfs-progs: fsck: Introduce --fix-dev-size option Qu Wenruo
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 14+ messages in thread
From: Qu Wenruo @ 2017-10-10  7:51 UTC (permalink / raw)
  To: linux-btrfs; +Cc: dsterba, yoasif, rrauenza

This patch introduces functions to repair device size related problems,
including:
1) Unaligned total_bytes of dev_item
   v4.14-rc kernel introduced total_bytes alignment checker.
   However older kernel device add/shrink doesn't align these members.
   This will cause kernel warning every time dev_item get updated.

   Although it can be fixed by shrinking device on latest kernel or
   use manually aligned size on older kernel, a fallback method in
   btrfs-progs won't hurt.

2) Mismatch super->total_bytes
   There are some reports about unmountable fs, due to mismatched
   super->total_bytes.
   And normal kernel device shrink won't help since it only modify the
   total_bytes by the size changed, not re-calculating it.

   The root cause is still under investigation, but at least to fix it
   in btrfs-progs

Fix all of them by manually rounding down total_bytes of each device, and
recalculate super->total_bytes using all existing devices.

Reported-by: Asif Youssuff <yoasif@gmail.com>
Reported-by: Rich Rauenzahn <rrauenza@gmail.com>
Signed-off-by: Qu Wenruo <quwenruo.btrfs@gmx.com>
---
 cmds-check.c | 191 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 191 insertions(+)

diff --git a/cmds-check.c b/cmds-check.c
index 0c08618ed701..007781fa5d1b 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -12885,6 +12885,197 @@ close_out:
 	return ret;
 }
 
+/*
+ * Return 0 if DEV_ITEM for @device is good
+ * Return >0 if DEV_ITEM for @device has unaligned value and fixed
+ * Return <0 if we failed to fix the unaligned value
+ */
+static int reset_one_dev_size(struct btrfs_fs_info *fs_info,
+			       struct btrfs_device *device)
+{
+	struct btrfs_trans_handle *trans;
+	struct btrfs_key key;
+	struct btrfs_path path;
+	struct btrfs_root *chunk_root = fs_info->chunk_root;
+	struct btrfs_dev_item *di;
+	u64 old_bytes = device->total_bytes;
+	int ret;
+
+	if (IS_ALIGNED(device->total_bytes, fs_info->sectorsize))
+		return 0;
+
+	/* Align the in-memory total_bytes first, and use it later */
+	device->total_bytes = round_down(device->total_bytes,
+					 fs_info->sectorsize);
+
+	key.objectid = BTRFS_DEV_ITEMS_OBJECTID;
+	key.type = BTRFS_DEV_ITEM_KEY;
+	key.offset = device->devid;
+
+	trans = btrfs_start_transaction(chunk_root, 1);
+	if (IS_ERR(trans)) {
+		ret = PTR_ERR(trans);
+		error("error starting transaction:  %d (%s)",
+		      ret, strerror(-ret));
+		return ret;
+	}
+
+	btrfs_init_path(&path);
+	ret = btrfs_search_slot(trans, chunk_root, &key, &path, 0, 1);
+	if (ret > 0) {
+		error("failed to find DEV_ITEM for devid %llu",
+			device->devid);
+		ret = -ENOENT;
+		goto err;
+	}
+	if (ret < 0) {
+		error("failed to search chunk root: %d (%s)",
+			ret, strerror(-ret));
+		goto err;
+	}
+	di = btrfs_item_ptr(path.nodes[0], path.slots[0],
+			    struct btrfs_dev_item);
+	btrfs_set_device_total_bytes(path.nodes[0], di, device->total_bytes);
+	btrfs_mark_buffer_dirty(path.nodes[0]);
+	ret = btrfs_commit_transaction(trans, chunk_root);
+	if (ret < 0) {
+		error("failed to commit current transaction: %d (%s)",
+			ret, strerror(-ret));
+		btrfs_release_path(&path);
+		return ret;
+	}
+	btrfs_release_path(&path);
+	printf("Fixed device size for devid %llu, old size: %llu new size: %llu\n",
+		device->devid, old_bytes, device->total_bytes);
+	return 1;
+
+err:
+	/* We haven't modified anything, it's OK to commit current trans */
+	btrfs_commit_transaction(trans, chunk_root);
+	btrfs_release_path(&path);
+	return ret;
+}
+
+/*
+ * Return 0 if super block total bytes matches with device total_bytes
+ * Return >0 if super block has mismatch total_bytes and fixed
+ * Return <0 if we failed to fix the mismatch total_bytes
+ */
+static int reset_super_total_bytes(struct btrfs_fs_info *fs_info)
+{
+	struct btrfs_trans_handle *trans;
+	struct btrfs_device *device;
+	struct list_head *dev_list = &fs_info->fs_devices->devices;
+	u64 total_bytes = 0;
+	u64 old_bytes = btrfs_super_total_bytes(fs_info->super_copy);
+	int ret;
+
+	list_for_each_entry(device, dev_list, dev_list)
+		total_bytes += device->total_bytes;
+
+	if (total_bytes == old_bytes)
+		return 0;
+
+	/* Just in case */
+	if (!IS_ALIGNED(total_bytes, fs_info->sectorsize)) {
+		error("final total_bytes still not aligned to %u, please report a bug to btrfs mail list",
+			fs_info->sectorsize);
+		return -EUCLEAN;
+	}
+
+	btrfs_set_super_total_bytes(fs_info->super_copy, total_bytes);
+
+	/* Commit transaction to update all super blocks */
+	trans = btrfs_start_transaction(fs_info->tree_root, 1);
+	if (IS_ERR(trans)) {
+		ret = PTR_ERR(trans);
+		error("error starting transaction:  %d (%s)",
+		      ret, strerror(-ret));
+		return ret;
+	}
+	ret = btrfs_commit_transaction(trans, fs_info->tree_root);
+	if (ret < 0) {
+		error("failed to commit current transaction: %d (%s)",
+			ret, strerror(-ret));
+		return ret;
+	}
+	printf("Fixed super total bytes, old size: %llu new size: %llu\n",
+		old_bytes, total_bytes);
+	return 1;
+}
+
+/*
+ * Repair device size related problems, including:
+ * 1) Unaligned total_bytes of dev_item
+ *    v4.14-rc kernel introduced total_bytes alignment checker.
+ *    However older kernel device add/shrink doesn't restrictly align
+ *    these members.
+ *    This will cause kernel warning everytime dev_item get updated.
+ *
+ *    Although it can be fixed by shrinking device on latest kernel or
+ *    use manually aligned size on older kernel, a fallback method in
+ *    btrfs-progs won't hurt.
+ *
+ * 2) Mismatch super->total_bytes
+ *    Due to similar situation, super->total_bytes can be mismatch with
+ *    total size of all devices.
+ *    This will cause the filesystem unable to be mounted with v4.14-rc kernel.
+ *
+ *    Add such fixer in btrfs-progs as a fallback method.
+ *
+ * Fix all of them by manually rounding down total_bytes of each device, and
+ * recalculate super->total_bytes using all existing devices.
+ */
+static int reset_devs_size(struct btrfs_fs_info *fs_info)
+{
+	struct btrfs_device *device;
+	struct list_head *dev_list = &fs_info->fs_devices->devices;
+	bool have_bad_value = false;
+	int ret;
+
+	/* Seed device is not support yet */
+	if (fs_info->fs_devices->seed) {
+		error("resetting device size with seed device is not supported yet");
+		return -ENOTTY;
+	}
+
+	/* All devices must be on-line before reparing */
+	if (list_empty(dev_list)) {
+		error("no device found");
+		return -ENODEV;
+	}
+	list_for_each_entry(device, dev_list, dev_list) {
+		if (device->fd == -1 || !device->writeable) {
+			error("device with devid %llu is missing or not writeable",
+			      device->devid);
+			error("resetting device size needs all device(s) present and writeable");
+			return -ENODEV;
+		}
+	}
+
+	/* Repair total_bytes of each device */
+	list_for_each_entry(device, dev_list, dev_list) {
+		ret = reset_one_dev_size(fs_info, device);
+		if (ret < 0)
+			return ret;
+		if (ret > 0)
+			have_bad_value = true;
+	}
+
+	/* Repair super total_byte */
+	ret = reset_super_total_bytes(fs_info);
+	if (ret > 0)
+		have_bad_value = true;
+	if (have_bad_value) {
+		printf("Fixed unaligned/mismatch total_bytes for superblock and device item\n");
+		ret = 1;
+	} else {
+		printf("No device size related problem found\n");
+		ret = 0;
+	}
+	return ret;
+}
+
 const char * const cmd_check_usage[] = {
 	"btrfs check [options] <device>",
 	"Check structural integrity of a filesystem (unmounted).",
-- 
2.14.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/4] btrfs-progs: fsck: Introduce --fix-dev-size option
  2017-10-10  7:51 [PATCH 0/4] btrfs-progs repair support for unaligned/mismatched device sizes Qu Wenruo
  2017-10-10  7:51 ` [PATCH 1/4] btrfs-progs: Introduce functions to repair unaligned/mismatch device size Qu Wenruo
@ 2017-10-10  7:51 ` Qu Wenruo
  2017-10-10 13:16   ` David Sterba
  2017-10-10  7:51 ` [PATCH 3/4] btrfs-progs: check: Also check unalignment/mismatch device and super size Qu Wenruo
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 14+ messages in thread
From: Qu Wenruo @ 2017-10-10  7:51 UTC (permalink / raw)
  To: linux-btrfs; +Cc: dsterba, yoasif, rrauenza

Introduce --fix-dev-size option to fix device related problems.

This repairing  is also included in --repair, but considering the safety
and potential affected users, it's better to provide a option to fix and
only fix device size related problem to avoid full --repair.

Reported-by: Asif Youssuff <yoasif@gmail.com>
Reported-by: Rich Rauenzahn <rrauenza@gmail.com>
Signed-off-by: Qu Wenruo <quwenruo.btrfs@gmx.com>
---
 Documentation/btrfs-check.asciidoc | 23 +++++++++++++++++++++++
 cmds-check.c                       | 28 +++++++++++++++++++++++++++-
 2 files changed, 50 insertions(+), 1 deletion(-)

diff --git a/Documentation/btrfs-check.asciidoc b/Documentation/btrfs-check.asciidoc
index fbf48847ac25..e45c7a457bac 100644
--- a/Documentation/btrfs-check.asciidoc
+++ b/Documentation/btrfs-check.asciidoc
@@ -93,6 +93,29 @@ the entire free space cache. This option with 'v2' provides an alternative
 method of clearing the free space cache that doesn't require mounting the
 filesystem.
 
+--fix-dev-size::
+From v4.14-rc kernels, a more restrict device size checker is introduced, while
+old kernel doesn't strictly align its device size, so it may cause noisy kernel
+warning for newer kernel, like:
++
+....
+WARNING: CPU: 3 PID: 439 at fs/btrfs/ctree.h:1559 btrfs_update_device+0x1c5/0x1d0 [btrfs]
+....
++
+And for some case where super block total device size may mismatch with all
+devices, and the filesystem will be unable to be mounted, with kernel message
+like:
++
+....
+BTRFS error (device sdb): super_total_bytes 92017859088384 mismatch with fs_devices total_rw_bytes 92017859094528 
+....
++
+This option will fix both problems by aligning all size of devices, and
+re-calculating superblock total bytes.
++
+Although such repairing is included in *--repair* option, considering the
+safety of *--repair*, this option is provided to suppress all other dangerous
+repairing and only fix device sizes related problems.
 
 DANGEROUS OPTIONS
 -----------------
diff --git a/cmds-check.c b/cmds-check.c
index 007781fa5d1b..fdb6d832eee1 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -11746,6 +11746,8 @@ out:
 	return err;
 }
 
+static int reset_devs_size(struct btrfs_fs_info *fs_info);
+
 static int do_check_chunks_and_extents(struct btrfs_fs_info *fs_info)
 {
 	int ret;
@@ -11756,6 +11758,12 @@ static int do_check_chunks_and_extents(struct btrfs_fs_info *fs_info)
 		ret = check_chunks_and_extents_v2(fs_info);
 	else
 		ret = check_chunks_and_extents(fs_info);
+	/* Also repair device sizes if needed */
+	if (repair && !ret) {
+		ret = reset_devs_size(fs_info);
+		if (ret > 0)
+			ret = 0;
+	}
 
 	return ret;
 }
@@ -13088,6 +13096,8 @@ const char * const cmd_check_usage[] = {
 	"-b|--backup                 use the first valid backup root copy",
 	"--force                     skip mount checks, repair is not possible",
 	"--repair                    try to repair the filesystem",
+	"--fix-dev-size              repair device size related problem",
+	"                            will not trigger other repair",
 	"--readonly                  run in read-only mode (default)",
 	"--init-csum-tree            create a new CRC tree",
 	"--init-extent-tree          create a new extent tree",
@@ -13128,6 +13138,7 @@ int cmd_check(int argc, char **argv)
 	int qgroups_repaired = 0;
 	unsigned ctree_flags = OPEN_CTREE_EXCLUSIVE;
 	int force = 0;
+	bool fix_dev_size = false;
 
 	while(1) {
 		int c;
@@ -13135,7 +13146,7 @@ int cmd_check(int argc, char **argv)
 			GETOPT_VAL_INIT_EXTENT, GETOPT_VAL_CHECK_CSUM,
 			GETOPT_VAL_READONLY, GETOPT_VAL_CHUNK_TREE,
 			GETOPT_VAL_MODE, GETOPT_VAL_CLEAR_SPACE_CACHE,
-			GETOPT_VAL_FORCE };
+			GETOPT_VAL_FORCE, GETOPT_VAL_FIX_DEV_SIZE };
 		static const struct option long_options[] = {
 			{ "super", required_argument, NULL, 's' },
 			{ "repair", no_argument, NULL, GETOPT_VAL_REPAIR },
@@ -13158,6 +13169,8 @@ int cmd_check(int argc, char **argv)
 			{ "clear-space-cache", required_argument, NULL,
 				GETOPT_VAL_CLEAR_SPACE_CACHE},
 			{ "force", no_argument, NULL, GETOPT_VAL_FORCE },
+			{ "fix-dev-size", no_argument, NULL,
+				GETOPT_VAL_FIX_DEV_SIZE },
 			{ NULL, 0, NULL, 0}
 		};
 
@@ -13245,6 +13258,11 @@ int cmd_check(int argc, char **argv)
 			case GETOPT_VAL_FORCE:
 				force = 1;
 				break;
+			case GETOPT_VAL_FIX_DEV_SIZE:
+				fix_dev_size = true;
+				repair = 1;
+				ctree_flags |= OPEN_CTREE_WRITES;
+				break;
 		}
 	}
 
@@ -13371,6 +13389,14 @@ int cmd_check(int argc, char **argv)
 			report_qgroups(1);
 		goto close_out;
 	}
+
+	if (fix_dev_size) {
+		ret = reset_devs_size(info);
+		if (ret > 0)
+			ret = 0;
+		err |= !!ret;
+		goto close_out;
+	}
 	if (subvolid) {
 		printf("Print extent state for subvolume %llu on %s\nUUID: %s\n",
 		       subvolid, argv[optind], uuidbuf);
-- 
2.14.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 3/4] btrfs-progs: check: Also check unalignment/mismatch device and super size
  2017-10-10  7:51 [PATCH 0/4] btrfs-progs repair support for unaligned/mismatched device sizes Qu Wenruo
  2017-10-10  7:51 ` [PATCH 1/4] btrfs-progs: Introduce functions to repair unaligned/mismatch device size Qu Wenruo
  2017-10-10  7:51 ` [PATCH 2/4] btrfs-progs: fsck: Introduce --fix-dev-size option Qu Wenruo
@ 2017-10-10  7:51 ` Qu Wenruo
  2017-10-10  8:31   ` Nikolay Borisov
  2017-10-10  7:51 ` [PATCH 4/4] btrfs-progs: test/fsck: Add test case image for --fix-dev-size Qu Wenruo
  2017-10-10  8:15 ` [PATCH 0/4] btrfs-progs repair support for unaligned/mismatched device sizes Nikolay Borisov
  4 siblings, 1 reply; 14+ messages in thread
From: Qu Wenruo @ 2017-10-10  7:51 UTC (permalink / raw)
  To: linux-btrfs; +Cc: dsterba, yoasif, rrauenza

Along with the fix introduced, also introduce check for them.
Unlike normal check funtions, some of the check is optional, and even if
the image failed to pass optional check, kernel can still run fine.
(But may cause noisy kernel warning)

So some check, mainly for alignment, will not cause btrfs check to fail,
but only to output warning and how to fix it.

Signed-off-by: Qu Wenruo <quwenruo.btrfs@gmx.com>
---
 cmds-check.c | 73 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 73 insertions(+)

diff --git a/cmds-check.c b/cmds-check.c
index fdb6d832eee1..23dd3b51102b 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -9879,6 +9879,67 @@ static int check_device_used(struct device_record *dev_rec,
 	}
 }
 
+/*
+ * Extra (optional) check for dev_item size
+ *
+ * To avoid possible kernel warning for v4.14 kernel.
+ * It's not a deadly problem, just to info v4.14 kernel user.
+ * So it won't change the return value.
+ */
+static void check_dev_size_alignment(u64 devid, u64 total_bytes,
+				     u32 sectorsize)
+{
+	if (!IS_ALIGNED(total_bytes, sectorsize)) {
+		warning("unaligned total_bytes detected for devid %llu, have %llu should be aligned to %u",
+			devid, total_bytes, sectorsize);
+		warning("this is OK for older kernel (<4.14), but may cause kernel warning for newer kernel (>=4.14)");
+		warning("this can be fixed by 'btrfs check --fix-dev-size'");
+	}
+}
+
+/*
+ * Unlike device size alignment check above, some super total_bytes check
+ * failure can lead to unmountable fs for kernel >=v4.6.
+ *
+ * So this function will return the error for fatal super total_bytes problem.
+ */
+static int check_super_size(struct btrfs_fs_info *fs_info)
+{
+	struct btrfs_device *dev;
+	struct list_head *dev_list = &fs_info->fs_devices->devices;
+	u64 total_bytes = 0;
+	u64 super_bytes = btrfs_super_total_bytes(fs_info->super_copy);
+
+	list_for_each_entry(dev, dev_list, dev_list)
+		total_bytes += dev->total_bytes;
+
+	/* Important check, which can cause unmountable fs */
+	if (super_bytes < total_bytes) {
+		error("super total bytes %llu smaller than real devices size %llu",
+			super_bytes, total_bytes);
+		error("this fs will not be mountable for newer kernels (>=v4.6)");
+		error("this can be fixed by 'btrfs check --fix-dev-size'");
+		return 1;
+	}
+
+	/*
+	 * Optional check, just to make everything aligned and match with
+	 * each other.
+	 *
+	 * For btrfs-image restored fs, we don't need to check it anyway.
+	 */
+	if (btrfs_super_flags(fs_info->super_copy) &
+	    (BTRFS_SUPER_FLAG_METADUMP | BTRFS_SUPER_FLAG_METADUMP_V2))
+		return 0;
+	if (!IS_ALIGNED(super_bytes, fs_info->sectorsize) ||
+	    !IS_ALIGNED(total_bytes, fs_info->sectorsize) ||
+	    super_bytes != total_bytes) {
+		warning("minor unaligned/mismatch device size detected");
+		warning("recommended to use 'btrfs check --fix-dev-size' to fix it");
+	}
+	return 0;
+}
+
 /* check btrfs_dev_item -> btrfs_dev_extent */
 static int check_devices(struct rb_root *dev_cache,
 			 struct device_extent_tree *dev_extent_cache)
@@ -9896,6 +9957,11 @@ static int check_devices(struct rb_root *dev_cache,
 		if (err)
 			ret = err;
 
+		if (!IS_ALIGNED(dev_rec->total_byte, global_info->sectorsize)) {
+		}
+
+		check_dev_size_alignment(dev_rec->devid, dev_rec->total_byte,
+					 global_info->sectorsize);
 		dev_node = rb_next(dev_node);
 	}
 	list_for_each_entry(dext_rec, &dev_extent_cache->no_device_orphans,
@@ -11141,6 +11207,7 @@ static int check_dev_item(struct btrfs_fs_info *fs_info,
 	struct btrfs_path path;
 	struct btrfs_key key;
 	struct btrfs_dev_extent *ptr;
+	u64 total_bytes;
 	u64 dev_id;
 	u64 used;
 	u64 total = 0;
@@ -11149,6 +11216,7 @@ static int check_dev_item(struct btrfs_fs_info *fs_info,
 	dev_item = btrfs_item_ptr(eb, slot, struct btrfs_dev_item);
 	dev_id = btrfs_device_id(eb, dev_item);
 	used = btrfs_device_bytes_used(eb, dev_item);
+	total_bytes = btrfs_device_total_bytes(eb, dev_item);
 
 	key.objectid = dev_id;
 	key.type = BTRFS_DEV_EXTENT_KEY;
@@ -11193,6 +11261,8 @@ next:
 			BTRFS_DEV_EXTENT_KEY, dev_id);
 		return ACCOUNTING_MISMATCH;
 	}
+	check_dev_size_alignment(dev_id, total_bytes, fs_info->sectorsize);
+
 	return 0;
 }
 
@@ -13471,6 +13541,9 @@ int cmd_check(int argc, char **argv)
 		error(
 		"errors found in extent allocation tree or chunk allocation");
 
+	/* Only re-check super size after we checked and repaired the fs */
+	err |= check_super_size(info);
+
 	ret = repair_root_items(info);
 	err |= !!ret;
 	if (ret < 0) {
-- 
2.14.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 4/4] btrfs-progs: test/fsck: Add test case image for --fix-dev-size
  2017-10-10  7:51 [PATCH 0/4] btrfs-progs repair support for unaligned/mismatched device sizes Qu Wenruo
                   ` (2 preceding siblings ...)
  2017-10-10  7:51 ` [PATCH 3/4] btrfs-progs: check: Also check unalignment/mismatch device and super size Qu Wenruo
@ 2017-10-10  7:51 ` Qu Wenruo
  2017-10-10  8:15 ` [PATCH 0/4] btrfs-progs repair support for unaligned/mismatched device sizes Nikolay Borisov
  4 siblings, 0 replies; 14+ messages in thread
From: Qu Wenruo @ 2017-10-10  7:51 UTC (permalink / raw)
  To: linux-btrfs; +Cc: dsterba, yoasif, rrauenza

The image has 2 problems mixed:

1) Too small super total_bytes
   This super total_bytes is manually modified to create such problem.

2) Unaligned dev item total_bytes
   This is created by v4.12 kernel, with 128M + 2K device added, and
   original device removed.
   Then we can create such image with unaligned dev item total_bytes.

Signed-off-by: Qu Wenruo <quwenruo.btrfs@gmx.com>
---
 .../dev_and_super_mismatch_unaligned.raw.xz             | Bin 0 -> 21536 bytes
 1 file changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 tests/fsck-tests/027-unaligned-super-dev-sizes/dev_and_super_mismatch_unaligned.raw.xz

diff --git a/tests/fsck-tests/027-unaligned-super-dev-sizes/dev_and_super_mismatch_unaligned.raw.xz b/tests/fsck-tests/027-unaligned-super-dev-sizes/dev_and_super_mismatch_unaligned.raw.xz
new file mode 100644
index 0000000000000000000000000000000000000000..153e514a89d5f50a7e6b4f9c3d3214896a4070cd
GIT binary patch
literal 21536
zcmeI4XIN8d7RM6^O{57*lP*o^y(5T75s)4_2uSZmrAd*16zK>K7>W=;sUj#fA{`P0
zl_Jt45IV|8lalq>-FbFqlyzoTefV&{<w-u=oA;jczyI?)?|Jtm1E8iL5M*|+S``n(
z20jS_fhhZ+y*M14sK){X^4P=S@@sJy6x5D;OWWu^N#>c&hY@YBV3@;S3bx6}fi){b
zx?*Cq1oO#8ZWK<(aZu?`n$WfR#k{+^Rg-$-s55fhSKZDWzB+29Leoco59gckqI$ne
zXqe$v;@mTR-7@QAoGk*8Tt%{;zm{>4F@VBFB4sQkt;EMp8>1II5(Beim`-kRHEJZ3
z&}d@XGR=Z&bb>aJx2$GPQY|D^c672tTuWv4sb?%nb{KMg+MhsOW%=slLT%FND*IUr
z{~+EH<f1rqb$8n7Zl;fZk^Q`&*CMfv%FSLVyRo=f_}0fkr#17XP=C7$z0grUS${OD
zsEk|l@y7<C?rOR^l9*$eIQp}Xf}3!$rg*fRGK1IklFD+R9c&rDyf&$t-eY{iScoks
z4%gT#k!VS@$8Hi4K#qf4E^t~Sc+eCu<JmUjM_hxb@<35ImdbI;SO99YPQy+n?=Zv>
znvEpP(8|~wE`Oa!(aNd6wVos)Oi)m<;_=K=>%!6gB&w29Yo<DMB{!Y932gp~9<*uR
z|Dg$H#*Bb?yW>si?vCCWUlg+$tMc}kzWf%i9f@pc&uQ0*$FlER8wIRNC=@iWC?`^r
z<0w<MysnvSkqjasiGtM<WHP4Bi7%DI3PVOYq|+p$%V_k7R+T3WX?yfqDKbgx>sj!f
z?&T19k{IRAxJICn(y!h|@iXDA<T<*W$emIG$3}E#FS06*(u$r891-KxxmFaJ!!W9u
z<;Ayl(Y8#vpDnnHfIbC{TFcF9%u&oQ$BW0}W=I?+g$)~ZCMG=8>6eWKl)SVVC8gG>
z1yXz~10D8Lq{#c<86@UKMifp+m<%8*NSL%A1ZcMo%u%<f_~G~={gQ9{KkaNwndF#+
zsVl4v1zpP<<yDgDlSKAPYx8`td7P@I`;ON~?A&wxUTCX8oKq6>ZNXz+jlAMYV3DDH
z%G}=KRGX={C}t_nQD+j?PJ3%mLR?rh$!NL6^*fGuyRZReGu!@+;&Q_IsfhJh^Yur%
z>N5;FNn%KZM-vOWqC?G)JO?KoKzx)e?2RFYwrPMn@PvrMM~=EcL)FxVi5g`#LE)Fo
z1T_39tm?rPRPV*O4cOzJXs3*WlWSv&3g(M%G7K;CI>Khu8Iz?~%ow#`BX4k`iFR3!
ze3*hUYx5-@*2@KW@eG@!T~TokQrOtbSTiX632EaHDk`^6cR;zbZ3%5UJbbBW!}$?~
zSf;=BsEj4LHmrvD)MJ?0ZTbdg4)~>Z$t`F3a@v}j*!DJJRSf<oA~~$IRSpXaCv6IY
z*2cp?w&lrg5&vZM$P=<wC1$seP6V$tM8W0k^I5qCBbsaoAXN>SSoQ_O56AYVd<)~N
zuka^KGp0*d3hmCUnNylwJ&k+NbRTOV?vyCpk<mhNR<pL=apJ^$FFe`|A+6@ped3jI
zhDa@IBTY+2hDKnMg=WA+Ir-9HV!YAuUlrOWgpD9){2M>kHLK5CNWHThd0O(U)zwH>
zZT=G}bAnFuTAcZq=9LEv<9=K%8b<Q?XW4EQb&GFl^XBRqBz5L-K4-3u6x|}wM`Pp)
z-nP7Fp1Wh77g|fKWsdG@FuOI479;UAdy--Kl({Va>_tkCIqzH-6zD<(T$Hb3u{D4i
zTsmW)+2IxtHw;rcC3t-sE~ZV`ymIW-Q}KnAo%cQb0u;N)N#>5}52N~nw`WMtb8+`a
zZfRul<2c0GC&iYBjOd2th>~+50?R><Q1M6YoK8K-whPVeMOxQ|70e(BuAnDGxPu(n
zR=eF}3*GLYrfxoF+0Y^EuHo}$Ad2$_(FZ5bLiPLw_RUGWs$j?X$+f}0{VKn4f!VAh
z!Xwq^h{CNjpBLtL^<oin0>kUsq|?N7$jLF5t6(xZetT{Vah`VgJxoM}EbXAAjMr%E
z#AeCfM<&g6Rg}+SLv@rMR_*HjHT0-TYHtH9plNLx`Qp5<w>yhcGoqE9XWSSCEftKy
zODP_XS-G{ix5?*<2*{pG7Ty_YrnjA>wT>|qL^XAs+f*Nxxwam{=bZFJXKT17HB2u}
z_e3oFI}TX-PT_e(joP&>K~<*MChOawi3YjL;i{9F+$|D2Qg%AesFhz<?nc2D;&yOg
zmob^?@vL?~>a}J0o{)7DWG^4p?LpLem#C++8aK%nVCpM#WR)(3<^we@a(pS>+@s0l
zk?N(T6Ow}C&ImV^WhFB&9r?8Rwi@1;Yr^j5<fHn9;pRHOsIAO_^yTQNyoUHSOJ`YR
zV3w}K;-0B^Re|*{EiEcf@eta#SD3U%&GyRGI0}zHklJ&GLF26Pcq|acl;GzOav|bY
zMW-Z-z9VU&NQl(i`$cT6&qk8*M?WN8|8N$Z<MT<ECa?ifzd8yjI)WEb7=ANkTcg(R
zDtlikmWyKEiU-7U@$S^qnTe%d3DJQ1{_-QvQ&xQJviG%=u$PF*Iam8AI|C$j@~{^Y
zSdZ#7P4LCMZ?7E7BP<y3sU_MMbJW(fQhCULHk9NzBE@d<E5bbZxk-9MXO|F?4K^%C
z_M+p39M#xW@4`>J+2?~^TP(tqspTDVRJJ7ft;ZI<t1zk?2JfT&_=HAVE*T`T-yy`e
z>hhGOL5fIb6dVT^%;l4uUMj(vay|4udLkLoB%N?-%JUwA!)l4AJ5Ih3j-v#h0WJCH
zBleogf`l31weiC4|NgxvvOg^&CggO|8n@pj{m*?bk)m*EJgOAMyFueScZCFukO`5$
zMuZ{lf5I#A;7jXLC{_Ur_JxN47VK}Du5T>Z{x(?TP^KiW7fA^&-xo})*e++eph^zD
z8_Rq|;dp39$$#~j2mjsgW^eBW!1sX61TypgEr$*cRY3c{T0ee!!T1iNl%a{_>*~DA
z?*aDWFLHu|pYwOq@PiP=!IuVd@_)=p!14b9f&mfseGM#7Q~nvV0ucs8*pCJI--@t&
z@YZ28VyFd`wk%`n1fy-WrCCIJ>5@8EeY<N5gfN=ip@Kb%t#0cahE9Fuas}GqT)y=D
zsX5mOIP<FZBMv*qG)!SsF-D!uE!}kHrduQ}A;X$$z?&JR3@HAyo;<fVEM)%eXKa?k
zPKin74jcSpwyBkBW7~($?T%emvgLtT@snaDu~&E$ZB;NU_#xc7MvTxK6-qlx4Z$)H
z6Y!4Kz7k#<@yZ7JoKqs>>+mbW^0u;2NN9UoWDISv3KBei`EmvC+Hx&Tt3Xvfl4ia|
zoRTDSbzs2(^Q^*MRCR!yFZVIMsiE{#W1fPMeTdq}7rsUjSrGT$#*w+0(mVV(fyKMu
zv5|Ao^Uod%7XT__pfUz3<3om#6TzZ~QjkBInBt>e1Vy@>zXa`+s0w2Zl6aGHXXDlJ
zo|Yc5dS{W!5HD3PVh*qywXv8EGyErZ4K^FC?1}*`1GEg#vO}h2l;Fcj`9Kuzm`$VW
zMWP9pZw8y{UQYH^H*a;TO2+1NqKP0Y@bLoWo@hmLP{IrCaxR`!FBbX#bVq+w_w$mj
zI|U$OfQSJicIZS5z@h*a{eL!79#n(3asr(Spi}WZw(EW?!Y059pv8Nm)M4oknIVjt
z5fYQgj0JLA<VflbvL<hA+H}cR_JRE<r}NeQ$6aUpwHft}`1nWHrW-%3(VB8%^4QUg
zrgZ)JPV9^OJHune4_Gt6ngP}fux5W@cH&?$2KGe!&DTA!V8DX?SbqdW7!YAVgaMNr
z{~99stsXy=*cye)6mmj#K+bczk|kIVL^zq;zsPa{w_i$NgHnVF<;?b&Ycaku_Hiqm
z>fFE1_V}mZZF5{RFf8=t{OrLV@V5{HpdtVj0jPMW=JyVl>fcRbZhO!<6C)`2<C~9P
zAg`|+3GjPKOaG(=2Z$Iz!~h}&5HWy=egETEey_DagaHu-MA$!1rueBH$v>Tl1f=A5
z8`S}rE5KX<<_a*^KQdPUhyg$h0Ahy*h=nTIgNQ#*B8inHbk2M}=ab$%HA@7Za|;CC
cM1VsRL(hKxZo!YMo`zOd2><hE5X8jvA3Bau9smFU

literal 0
HcmV?d00001

-- 
2.14.2


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/4] btrfs-progs repair support for unaligned/mismatched device sizes
  2017-10-10  7:51 [PATCH 0/4] btrfs-progs repair support for unaligned/mismatched device sizes Qu Wenruo
                   ` (3 preceding siblings ...)
  2017-10-10  7:51 ` [PATCH 4/4] btrfs-progs: test/fsck: Add test case image for --fix-dev-size Qu Wenruo
@ 2017-10-10  8:15 ` Nikolay Borisov
  2017-10-10  8:31   ` Qu Wenruo
  4 siblings, 1 reply; 14+ messages in thread
From: Nikolay Borisov @ 2017-10-10  8:15 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs; +Cc: dsterba, yoasif, rrauenza



On 10.10.2017 10:51, Qu Wenruo wrote:
> The patchset can be fetched from github:
> https://github.com/adam900710/btrfs-progs/tree/check_unaligned_dev
> 
> There are several reports in mail list for btrfs device size related
> problems.
> 
> 1) Unmountable fs, due to mismatched super total_bytes
>    Unmountable if super total_bytes is smaller than total rw bytes of
>    all devices.
>    Root cause under investigation, but only one report here.

Don't you mean mountable? We've internally seen mounting being a problem
due to said size mismatch.

> 
>    This patchset provides the tool to fix it offline.
>    (At least better than unmountable forever)
> 
> 2) Harmless kernel warning for btrfs_update_device()
>    v4.14 introduced restrict device size checker.
>    This somewhat break the backward compatibility and causing kernel
>    warning.
> 
>    It can be fixed online with "btrfs filesystem resize".
>    (Although it is better to fixed it at mount time)
> 
>    This patchset also provide a fallback method to fix it.
> 
> Qu Wenruo (4):
>   btrfs-progs: Introduce functions to repair unaligned/mismatch device
>     size
>   btrfs-progs: fsck: Introduce --fix-dev-size option
>   btrfs-progs: check: Also check unalignment/mismatch device and super
>     size
>   btrfs-progs: test/fsck: Add test case image for --fix-dev-size
> 
> Qu Wenruo (4):
>   btrfs-progs: Introduce functions to repair unaligned/mismatch device
>     size
>   btrfs-progs: fsck: Introduce --fix-dev-size option
>   btrfs-progs: check: Also check unalignment/mismatch device and super
>     size
>   btrfs-progs: test/fsck: Add test case image for --fix-dev-size
> 
>  Documentation/btrfs-check.asciidoc                 |  23 ++
>  cmds-check.c                                       | 292 ++++++++++++++++++++-
>  .../dev_and_super_mismatch_unaligned.raw.xz        | Bin 0 -> 21536 bytes
>  3 files changed, 314 insertions(+), 1 deletion(-)
>  create mode 100644 tests/fsck-tests/027-unaligned-super-dev-sizes/dev_and_super_mismatch_unaligned.raw.xz
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/4] btrfs-progs: Introduce functions to repair unaligned/mismatch device size
  2017-10-10  7:51 ` [PATCH 1/4] btrfs-progs: Introduce functions to repair unaligned/mismatch device size Qu Wenruo
@ 2017-10-10  8:24   ` Nikolay Borisov
  0 siblings, 0 replies; 14+ messages in thread
From: Nikolay Borisov @ 2017-10-10  8:24 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs; +Cc: dsterba, yoasif, rrauenza



On 10.10.2017 10:51, Qu Wenruo wrote:
> This patch introduces functions to repair device size related problems,
> including:
> 1) Unaligned total_bytes of dev_item
>    v4.14-rc kernel introduced total_bytes alignment checker.
>    However older kernel device add/shrink doesn't align these members.
>    This will cause kernel warning every time dev_item get updated.
> 
>    Although it can be fixed by shrinking device on latest kernel or
>    use manually aligned size on older kernel, a fallback method in
>    btrfs-progs won't hurt.
> 
> 2) Mismatch super->total_bytes
>    There are some reports about unmountable fs, due to mismatched
>    super->total_bytes.
>    And normal kernel device shrink won't help since it only modify the
>    total_bytes by the size changed, not re-calculating it.
> 
>    The root cause is still under investigation, but at least to fix it
>    in btrfs-progs
> 
> Fix all of them by manually rounding down total_bytes of each device, and
> recalculate super->total_bytes using all existing devices.

Thanks for doing this, we needed it. One minor nit in the comments but
overall:

Reviewed-by: Nikolay Borisov <nborisov@suse.com>


> 
> Reported-by: Asif Youssuff <yoasif@gmail.com>
> Reported-by: Rich Rauenzahn <rrauenza@gmail.com>
> Signed-off-by: Qu Wenruo <quwenruo.btrfs@gmx.com>> ---
>  cmds-check.c | 191 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 191 insertions(+)
> 
> diff --git a/cmds-check.c b/cmds-check.c
> index 0c08618ed701..007781fa5d1b 100644
> --- a/cmds-check.c
> +++ b/cmds-check.c
> @@ -12885,6 +12885,197 @@ close_out:
>  	return ret;
>  }
>  
> +/*
> + * Return 0 if DEV_ITEM for @device is good
> + * Return >0 if DEV_ITEM for @device has unaligned value and fixed
> + * Return <0 if we failed to fix the unaligned value
> + */
> +static int reset_one_dev_size(struct btrfs_fs_info *fs_info,
> +			       struct btrfs_device *device)
> +{
> +	struct btrfs_trans_handle *trans;
> +	struct btrfs_key key;
> +	struct btrfs_path path;
> +	struct btrfs_root *chunk_root = fs_info->chunk_root;
> +	struct btrfs_dev_item *di;
> +	u64 old_bytes = device->total_bytes;
> +	int ret;
> +
> +	if (IS_ALIGNED(device->total_bytes, fs_info->sectorsize))
> +		return 0;
> +
> +	/* Align the in-memory total_bytes first, and use it later */
> +	device->total_bytes = round_down(device->total_bytes,
> +					 fs_info->sectorsize);
> +
> +	key.objectid = BTRFS_DEV_ITEMS_OBJECTID;
> +	key.type = BTRFS_DEV_ITEM_KEY;
> +	key.offset = device->devid;
> +
> +	trans = btrfs_start_transaction(chunk_root, 1);
> +	if (IS_ERR(trans)) {
> +		ret = PTR_ERR(trans);
> +		error("error starting transaction:  %d (%s)",
> +		      ret, strerror(-ret));
> +		return ret;
> +	}
> +
> +	btrfs_init_path(&path);
> +	ret = btrfs_search_slot(trans, chunk_root, &key, &path, 0, 1);
> +	if (ret > 0) {
> +		error("failed to find DEV_ITEM for devid %llu",
> +			device->devid);
> +		ret = -ENOENT;
> +		goto err;
> +	}
> +	if (ret < 0) {
> +		error("failed to search chunk root: %d (%s)",
> +			ret, strerror(-ret));
> +		goto err;
> +	}
> +	di = btrfs_item_ptr(path.nodes[0], path.slots[0],
> +			    struct btrfs_dev_item);
> +	btrfs_set_device_total_bytes(path.nodes[0], di, device->total_bytes);
> +	btrfs_mark_buffer_dirty(path.nodes[0]);
> +	ret = btrfs_commit_transaction(trans, chunk_root);
> +	if (ret < 0) {
> +		error("failed to commit current transaction: %d (%s)",
> +			ret, strerror(-ret));
> +		btrfs_release_path(&path);
> +		return ret;
> +	}
> +	btrfs_release_path(&path);
> +	printf("Fixed device size for devid %llu, old size: %llu new size: %llu\n",
> +		device->devid, old_bytes, device->total_bytes);
> +	return 1;
> +
> +err:
> +	/* We haven't modified anything, it's OK to commit current trans */
> +	btrfs_commit_transaction(trans, chunk_root);
> +	btrfs_release_path(&path);
> +	return ret;
> +}
> +
> +/*
> + * Return 0 if super block total bytes matches with device total_bytes
> + * Return >0 if super block has mismatch total_bytes and fixed
> + * Return <0 if we failed to fix the mismatch total_bytes
> + */
> +static int reset_super_total_bytes(struct btrfs_fs_info *fs_info)
> +{
> +	struct btrfs_trans_handle *trans;
> +	struct btrfs_device *device;
> +	struct list_head *dev_list = &fs_info->fs_devices->devices;
> +	u64 total_bytes = 0;
> +	u64 old_bytes = btrfs_super_total_bytes(fs_info->super_copy);
> +	int ret;
> +
> +	list_for_each_entry(device, dev_list, dev_list)
> +		total_bytes += device->total_bytes;
> +
> +	if (total_bytes == old_bytes)
> +		return 0;
> +
> +	/* Just in case */
> +	if (!IS_ALIGNED(total_bytes, fs_info->sectorsize)) {
> +		error("final total_bytes still not aligned to %u, please report a bug to btrfs mail list",
> +			fs_info->sectorsize);
> +		return -EUCLEAN;
> +	}
> +
> +	btrfs_set_super_total_bytes(fs_info->super_copy, total_bytes);
> +
> +	/* Commit transaction to update all super blocks */
> +	trans = btrfs_start_transaction(fs_info->tree_root, 1);
> +	if (IS_ERR(trans)) {
> +		ret = PTR_ERR(trans);
> +		error("error starting transaction:  %d (%s)",
> +		      ret, strerror(-ret));
> +		return ret;
> +	}
> +	ret = btrfs_commit_transaction(trans, fs_info->tree_root);
> +	if (ret < 0) {
> +		error("failed to commit current transaction: %d (%s)",
> +			ret, strerror(-ret));
> +		return ret;
> +	}
> +	printf("Fixed super total bytes, old size: %llu new size: %llu\n",
> +		old_bytes, total_bytes);
> +	return 1;
> +}
> +
> +/*
> + * Repair device size related problems, including:
> + * 1) Unaligned total_bytes of dev_item
> + *    v4.14-rc kernel introduced total_bytes alignment checker.
> + *    However older kernel device add/shrink doesn't restrictly align
> + *    these members.
> + *    This will cause kernel warning everytime dev_item get updated.
> + *
> + *    Although it can be fixed by shrinking device on latest kernel or
> + *    use manually aligned size on older kernel, a fallback method in
> + *    btrfs-progs won't hurt.
> + *
> + * 2) Mismatch super->total_bytes
> + *    Due to similar situation, super->total_bytes can be mismatch with
> + *    total size of all devices.
> + *    This will cause the filesystem unable to be mounted with v4.14-rc kernel.

The commit which introduced the code preventing mounting can be traced
back to 4.7: 99e3ecfcb9f4 ("Btrfs: add more validation checks for
superblock"). I'd say remove the version information or fix it to point
to v4.7.

> + *
> + *    Add such fixer in btrfs-progs as a fallback method.
> + *
> + * Fix all of them by manually rounding down total_bytes of each device, and
> + * recalculate super->total_bytes using all existing devices.
> + */
> +static int reset_devs_size(struct btrfs_fs_info *fs_info)
> +{
> +	struct btrfs_device *device;
> +	struct list_head *dev_list = &fs_info->fs_devices->devices;
> +	bool have_bad_value = false;
> +	int ret;
> +
> +	/* Seed device is not support yet */
> +	if (fs_info->fs_devices->seed) {
> +		error("resetting device size with seed device is not supported yet");
> +		return -ENOTTY;
> +	}
> +
> +	/* All devices must be on-line before reparing */
> +	if (list_empty(dev_list)) {
> +		error("no device found");
> +		return -ENODEV;
> +	}
> +	list_for_each_entry(device, dev_list, dev_list) {
> +		if (device->fd == -1 || !device->writeable) {
> +			error("device with devid %llu is missing or not writeable",
> +			      device->devid);
> +			error("resetting device size needs all device(s) present and writeable");
> +			return -ENODEV;
> +		}
> +	}
> +
> +	/* Repair total_bytes of each device */
> +	list_for_each_entry(device, dev_list, dev_list) {
> +		ret = reset_one_dev_size(fs_info, device);
> +		if (ret < 0)
> +			return ret;
> +		if (ret > 0)
> +			have_bad_value = true;
> +	}
> +
> +	/* Repair super total_byte */
> +	ret = reset_super_total_bytes(fs_info);
> +	if (ret > 0)
> +		have_bad_value = true;
> +	if (have_bad_value) {
> +		printf("Fixed unaligned/mismatch total_bytes for superblock and device item\n");
> +		ret = 1;
> +	} else {
> +		printf("No device size related problem found\n");
> +		ret = 0;
> +	}
> +	return ret;
> +}
> +
>  const char * const cmd_check_usage[] = {
>  	"btrfs check [options] <device>",
>  	"Check structural integrity of a filesystem (unmounted).",
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/4] btrfs-progs: check: Also check unalignment/mismatch device and super size
  2017-10-10  7:51 ` [PATCH 3/4] btrfs-progs: check: Also check unalignment/mismatch device and super size Qu Wenruo
@ 2017-10-10  8:31   ` Nikolay Borisov
  2017-10-10  8:34     ` Qu Wenruo
  0 siblings, 1 reply; 14+ messages in thread
From: Nikolay Borisov @ 2017-10-10  8:31 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs; +Cc: dsterba, yoasif, rrauenza



On 10.10.2017 10:51, Qu Wenruo wrote:
> Along with the fix introduced, also introduce check for them.
> Unlike normal check funtions, some of the check is optional, and even if
> the image failed to pass optional check, kernel can still run fine.
> (But may cause noisy kernel warning)
> 
> So some check, mainly for alignment, will not cause btrfs check to fail,
> but only to output warning and how to fix it.
> 
> Signed-off-by: Qu Wenruo <quwenruo.btrfs@gmx.com>
> ---
>  cmds-check.c | 73 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 73 insertions(+)
> 
> diff --git a/cmds-check.c b/cmds-check.c
> index fdb6d832eee1..23dd3b51102b 100644
> --- a/cmds-check.c
> +++ b/cmds-check.c
> @@ -9879,6 +9879,67 @@ static int check_device_used(struct device_record *dev_rec,
>  	}
>  }
>  
> +/*
> + * Extra (optional) check for dev_item size
> + *
> + * To avoid possible kernel warning for v4.14 kernel.
> + * It's not a deadly problem, just to info v4.14 kernel user.
> + * So it won't change the return value.
> + */
> +static void check_dev_size_alignment(u64 devid, u64 total_bytes,
> +				     u32 sectorsize)
> +{
> +	if (!IS_ALIGNED(total_bytes, sectorsize)) {
> +		warning("unaligned total_bytes detected for devid %llu, have %llu should be aligned to %u",
> +			devid, total_bytes, sectorsize);
> +		warning("this is OK for older kernel (<4.14), but may cause kernel warning for newer kernel (>=4.14)");
> +		warning("this can be fixed by 'btrfs check --fix-dev-size'");
> +	}
> +}
> +
> +/*
> + * Unlike device size alignment check above, some super total_bytes check
> + * failure can lead to unmountable fs for kernel >=v4.6.
> + *
> + * So this function will return the error for fatal super total_bytes problem.
> + */
> +static int check_super_size(struct btrfs_fs_info *fs_info)
> +{
> +	struct btrfs_device *dev;
> +	struct list_head *dev_list = &fs_info->fs_devices->devices;
> +	u64 total_bytes = 0;
> +	u64 super_bytes = btrfs_super_total_bytes(fs_info->super_copy);
> +
> +	list_for_each_entry(dev, dev_list, dev_list)
> +		total_bytes += dev->total_bytes;
> +
> +	/* Important check, which can cause unmountable fs */
> +	if (super_bytes < total_bytes) {
> +		error("super total bytes %llu smaller than real devices size %llu",
> +			super_bytes, total_bytes);
> +		error("this fs will not be mountable for newer kernels (>=v4.6)");
> +		error("this can be fixed by 'btrfs check --fix-dev-size'");
> +		return 1;
> +	}
> +
> +	/*
> +	 * Optional check, just to make everything aligned and match with
> +	 * each other.
> +	 *
> +	 * For btrfs-image restored fs, we don't need to check it anyway.
> +	 */
> +	if (btrfs_super_flags(fs_info->super_copy) &
> +	    (BTRFS_SUPER_FLAG_METADUMP | BTRFS_SUPER_FLAG_METADUMP_V2))
> +		return 0;
> +	if (!IS_ALIGNED(super_bytes, fs_info->sectorsize) ||
> +	    !IS_ALIGNED(total_bytes, fs_info->sectorsize) ||
> +	    super_bytes != total_bytes) {
> +		warning("minor unaligned/mismatch device size detected");
> +		warning("recommended to use 'btrfs check --fix-dev-size' to fix it");
> +	}
> +	return 0;
> +}

nit: Make the function return bool and perhaps rename it to
"is_super_size_aligned" or something like that ?

> +
>  /* check btrfs_dev_item -> btrfs_dev_extent */
>  static int check_devices(struct rb_root *dev_cache,
>  			 struct device_extent_tree *dev_extent_cache)
> @@ -9896,6 +9957,11 @@ static int check_devices(struct rb_root *dev_cache,
>  		if (err)
>  			ret = err;
>  
> +		if (!IS_ALIGNED(dev_rec->total_byte, global_info->sectorsize)) {
> +		}
> +
> +		check_dev_size_alignment(dev_rec->devid, dev_rec->total_byte,
> +					 global_info->sectorsize);
>  		dev_node = rb_next(dev_node);

Something funny is going on here, the body of the if is empty, perhaps
those function have to be inside?


>  	}
>  	list_for_each_entry(dext_rec, &dev_extent_cache->no_device_orphans,
> @@ -11141,6 +11207,7 @@ static int check_dev_item(struct btrfs_fs_info *fs_info,
>  	struct btrfs_path path;
>  	struct btrfs_key key;
>  	struct btrfs_dev_extent *ptr;
> +	u64 total_bytes;
>  	u64 dev_id;
>  	u64 used;
>  	u64 total = 0;
> @@ -11149,6 +11216,7 @@ static int check_dev_item(struct btrfs_fs_info *fs_info,
>  	dev_item = btrfs_item_ptr(eb, slot, struct btrfs_dev_item);
>  	dev_id = btrfs_device_id(eb, dev_item);
>  	used = btrfs_device_bytes_used(eb, dev_item);
> +	total_bytes = btrfs_device_total_bytes(eb, dev_item);
>  
>  	key.objectid = dev_id;
>  	key.type = BTRFS_DEV_EXTENT_KEY;
> @@ -11193,6 +11261,8 @@ next:
>  			BTRFS_DEV_EXTENT_KEY, dev_id);
>  		return ACCOUNTING_MISMATCH;
>  	}
> +	check_dev_size_alignment(dev_id, total_bytes, fs_info->sectorsize);
> +
>  	return 0;
>  }
>  
> @@ -13471,6 +13541,9 @@ int cmd_check(int argc, char **argv)
>  		error(
>  		"errors found in extent allocation tree or chunk allocation");
>  
> +	/* Only re-check super size after we checked and repaired the fs */
> +	err |= check_super_size(info);
> +
>  	ret = repair_root_items(info);
>  	err |= !!ret;
>  	if (ret < 0) {
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/4] btrfs-progs repair support for unaligned/mismatched device sizes
  2017-10-10  8:15 ` [PATCH 0/4] btrfs-progs repair support for unaligned/mismatched device sizes Nikolay Borisov
@ 2017-10-10  8:31   ` Qu Wenruo
  0 siblings, 0 replies; 14+ messages in thread
From: Qu Wenruo @ 2017-10-10  8:31 UTC (permalink / raw)
  To: Nikolay Borisov, linux-btrfs; +Cc: dsterba, yoasif, rrauenza



On 2017年10月10日 16:15, Nikolay Borisov wrote:
> 
> 
> On 10.10.2017 10:51, Qu Wenruo wrote:
>> The patchset can be fetched from github:
>> https://github.com/adam900710/btrfs-progs/tree/check_unaligned_dev
>>
>> There are several reports in mail list for btrfs device size related
>> problems.
>>
>> 1) Unmountable fs, due to mismatched super total_bytes
>>     Unmountable if super total_bytes is smaller than total rw bytes of
>>     all devices.
>>     Root cause under investigation, but only one report here.
> 
> Don't you mean mountable? We've internally seen mounting being a problem
> due to said size mismatch.

I mean the fs is unable to be mounted.
(Which is the correct single word to express "unable to be mounted"?)

The direct cause is obvious, strict super block total_bytes checker.

But I don't know how it happened.

Maybe there is a window between some kernel release make it out of sync, 
but anyway, your introduced alignment enhancement commit should handle 
them well in v4.14.

At least for worst case, such users with fs unable to mount, won't be 
left in cold wind.
And new users starting from v4.13 should not hit such problems.

Thanks,
Qu

> 
>>
>>     This patchset provides the tool to fix it offline.
>>     (At least better than unmountable forever)
>>
>> 2) Harmless kernel warning for btrfs_update_device()
>>     v4.14 introduced restrict device size checker.
>>     This somewhat break the backward compatibility and causing kernel
>>     warning.
>>
>>     It can be fixed online with "btrfs filesystem resize".
>>     (Although it is better to fixed it at mount time)
>>
>>     This patchset also provide a fallback method to fix it.
>>
>> Qu Wenruo (4):
>>    btrfs-progs: Introduce functions to repair unaligned/mismatch device
>>      size
>>    btrfs-progs: fsck: Introduce --fix-dev-size option
>>    btrfs-progs: check: Also check unalignment/mismatch device and super
>>      size
>>    btrfs-progs: test/fsck: Add test case image for --fix-dev-size
>>
>> Qu Wenruo (4):
>>    btrfs-progs: Introduce functions to repair unaligned/mismatch device
>>      size
>>    btrfs-progs: fsck: Introduce --fix-dev-size option
>>    btrfs-progs: check: Also check unalignment/mismatch device and super
>>      size
>>    btrfs-progs: test/fsck: Add test case image for --fix-dev-size
>>
>>   Documentation/btrfs-check.asciidoc                 |  23 ++
>>   cmds-check.c                                       | 292 ++++++++++++++++++++-
>>   .../dev_and_super_mismatch_unaligned.raw.xz        | Bin 0 -> 21536 bytes
>>   3 files changed, 314 insertions(+), 1 deletion(-)
>>   create mode 100644 tests/fsck-tests/027-unaligned-super-dev-sizes/dev_and_super_mismatch_unaligned.raw.xz
>>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/4] btrfs-progs: check: Also check unalignment/mismatch device and super size
  2017-10-10  8:31   ` Nikolay Borisov
@ 2017-10-10  8:34     ` Qu Wenruo
  0 siblings, 0 replies; 14+ messages in thread
From: Qu Wenruo @ 2017-10-10  8:34 UTC (permalink / raw)
  To: Nikolay Borisov, linux-btrfs; +Cc: dsterba, yoasif, rrauenza



On 2017年10月10日 16:31, Nikolay Borisov wrote:
> 
> 
> On 10.10.2017 10:51, Qu Wenruo wrote:
>> Along with the fix introduced, also introduce check for them.
>> Unlike normal check funtions, some of the check is optional, and even if
>> the image failed to pass optional check, kernel can still run fine.
>> (But may cause noisy kernel warning)
>>
>> So some check, mainly for alignment, will not cause btrfs check to fail,
>> but only to output warning and how to fix it.
>>
>> Signed-off-by: Qu Wenruo <quwenruo.btrfs@gmx.com>
>> ---
>>   cmds-check.c | 73 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>   1 file changed, 73 insertions(+)
>>
>> diff --git a/cmds-check.c b/cmds-check.c
>> index fdb6d832eee1..23dd3b51102b 100644
>> --- a/cmds-check.c
>> +++ b/cmds-check.c
>> @@ -9879,6 +9879,67 @@ static int check_device_used(struct device_record *dev_rec,
>>   	}
>>   }
>>   
>> +/*
>> + * Extra (optional) check for dev_item size
>> + *
>> + * To avoid possible kernel warning for v4.14 kernel.
>> + * It's not a deadly problem, just to info v4.14 kernel user.
>> + * So it won't change the return value.
>> + */
>> +static void check_dev_size_alignment(u64 devid, u64 total_bytes,
>> +				     u32 sectorsize)
>> +{
>> +	if (!IS_ALIGNED(total_bytes, sectorsize)) {
>> +		warning("unaligned total_bytes detected for devid %llu, have %llu should be aligned to %u",
>> +			devid, total_bytes, sectorsize);
>> +		warning("this is OK for older kernel (<4.14), but may cause kernel warning for newer kernel (>=4.14)");
>> +		warning("this can be fixed by 'btrfs check --fix-dev-size'");
>> +	}
>> +}
>> +
>> +/*
>> + * Unlike device size alignment check above, some super total_bytes check
>> + * failure can lead to unmountable fs for kernel >=v4.6.
>> + *
>> + * So this function will return the error for fatal super total_bytes problem.
>> + */
>> +static int check_super_size(struct btrfs_fs_info *fs_info)
>> +{
>> +	struct btrfs_device *dev;
>> +	struct list_head *dev_list = &fs_info->fs_devices->devices;
>> +	u64 total_bytes = 0;
>> +	u64 super_bytes = btrfs_super_total_bytes(fs_info->super_copy);
>> +
>> +	list_for_each_entry(dev, dev_list, dev_list)
>> +		total_bytes += dev->total_bytes;
>> +
>> +	/* Important check, which can cause unmountable fs */
>> +	if (super_bytes < total_bytes) {
>> +		error("super total bytes %llu smaller than real devices size %llu",
>> +			super_bytes, total_bytes);
>> +		error("this fs will not be mountable for newer kernels (>=v4.6)");
>> +		error("this can be fixed by 'btrfs check --fix-dev-size'");
>> +		return 1;
>> +	}
>> +
>> +	/*
>> +	 * Optional check, just to make everything aligned and match with
>> +	 * each other.
>> +	 *
>> +	 * For btrfs-image restored fs, we don't need to check it anyway.
>> +	 */
>> +	if (btrfs_super_flags(fs_info->super_copy) &
>> +	    (BTRFS_SUPER_FLAG_METADUMP | BTRFS_SUPER_FLAG_METADUMP_V2))
>> +		return 0;
>> +	if (!IS_ALIGNED(super_bytes, fs_info->sectorsize) ||
>> +	    !IS_ALIGNED(total_bytes, fs_info->sectorsize) ||
>> +	    super_bytes != total_bytes) {
>> +		warning("minor unaligned/mismatch device size detected");
>> +		warning("recommended to use 'btrfs check --fix-dev-size' to fix it");
>> +	}
>> +	return 0;
>> +}
> 
> nit: Make the function return bool and perhaps rename it to
> "is_super_size_aligned" or something like that ?
> 
>> +
>>   /* check btrfs_dev_item -> btrfs_dev_extent */
>>   static int check_devices(struct rb_root *dev_cache,
>>   			 struct device_extent_tree *dev_extent_cache)
>> @@ -9896,6 +9957,11 @@ static int check_devices(struct rb_root *dev_cache,
>>   		if (err)
>>   			ret = err;
>>   
>> +		if (!IS_ALIGNED(dev_rec->total_byte, global_info->sectorsize)) {
>> +		}
>> +
>> +		check_dev_size_alignment(dev_rec->devid, dev_rec->total_byte,
>> +					 global_info->sectorsize);
>>   		dev_node = rb_next(dev_node);
> 
> Something funny is going on here, the body of the if is empty, perhaps
> those function have to be inside?

Oh...

That's embarrassing, that's the old code where I didn't encapsulate the 
check into check_dev_size_alignment().

I'll update the patchset to address all your comment.

Thanks,
Qu
> 
> 
>>   	}
>>   	list_for_each_entry(dext_rec, &dev_extent_cache->no_device_orphans,
>> @@ -11141,6 +11207,7 @@ static int check_dev_item(struct btrfs_fs_info *fs_info,
>>   	struct btrfs_path path;
>>   	struct btrfs_key key;
>>   	struct btrfs_dev_extent *ptr;
>> +	u64 total_bytes;
>>   	u64 dev_id;
>>   	u64 used;
>>   	u64 total = 0;
>> @@ -11149,6 +11216,7 @@ static int check_dev_item(struct btrfs_fs_info *fs_info,
>>   	dev_item = btrfs_item_ptr(eb, slot, struct btrfs_dev_item);
>>   	dev_id = btrfs_device_id(eb, dev_item);
>>   	used = btrfs_device_bytes_used(eb, dev_item);
>> +	total_bytes = btrfs_device_total_bytes(eb, dev_item);
>>   
>>   	key.objectid = dev_id;
>>   	key.type = BTRFS_DEV_EXTENT_KEY;
>> @@ -11193,6 +11261,8 @@ next:
>>   			BTRFS_DEV_EXTENT_KEY, dev_id);
>>   		return ACCOUNTING_MISMATCH;
>>   	}
>> +	check_dev_size_alignment(dev_id, total_bytes, fs_info->sectorsize);
>> +
>>   	return 0;
>>   }
>>   
>> @@ -13471,6 +13541,9 @@ int cmd_check(int argc, char **argv)
>>   		error(
>>   		"errors found in extent allocation tree or chunk allocation");
>>   
>> +	/* Only re-check super size after we checked and repaired the fs */
>> +	err |= check_super_size(info);
>> +
>>   	ret = repair_root_items(info);
>>   	err |= !!ret;
>>   	if (ret < 0) {
>>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/4] btrfs-progs: fsck: Introduce --fix-dev-size option
  2017-10-10  7:51 ` [PATCH 2/4] btrfs-progs: fsck: Introduce --fix-dev-size option Qu Wenruo
@ 2017-10-10 13:16   ` David Sterba
  2017-10-11  0:43     ` Qu Wenruo
  0 siblings, 1 reply; 14+ messages in thread
From: David Sterba @ 2017-10-10 13:16 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs, dsterba, yoasif, rrauenza

On Tue, Oct 10, 2017 at 07:51:11AM +0000, Qu Wenruo wrote:
> Introduce --fix-dev-size option to fix device related problems.

Please don't add it to 'check', this is not the right place for the
targeted fixes. -> 'btrfs rescue'

> This repairing  is also included in --repair, but considering the safety
> and potential affected users, it's better to provide a option to fix and
> only fix device size related problem to avoid full --repair.
> 
> Reported-by: Asif Youssuff <yoasif@gmail.com>
> Reported-by: Rich Rauenzahn <rrauenza@gmail.com>
> Signed-off-by: Qu Wenruo <quwenruo.btrfs@gmx.com>
> ---
>  Documentation/btrfs-check.asciidoc | 23 +++++++++++++++++++++++
>  cmds-check.c                       | 28 +++++++++++++++++++++++++++-
>  2 files changed, 50 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/btrfs-check.asciidoc b/Documentation/btrfs-check.asciidoc
> index fbf48847ac25..e45c7a457bac 100644
> --- a/Documentation/btrfs-check.asciidoc
> +++ b/Documentation/btrfs-check.asciidoc
> @@ -93,6 +93,29 @@ the entire free space cache. This option with 'v2' provides an alternative
>  method of clearing the free space cache that doesn't require mounting the
>  filesystem.
>  
> +--fix-dev-size::
> +From v4.14-rc kernels, a more restrict device size checker is introduced, while
> +old kernel doesn't strictly align its device size, so it may cause noisy kernel
> +warning for newer kernel, like:
> ++
> +....
> +WARNING: CPU: 3 PID: 439 at fs/btrfs/ctree.h:1559 btrfs_update_device+0x1c5/0x1d0 [btrfs]
> +....
> ++
> +And for some case where super block total device size may mismatch with all
> +devices, and the filesystem will be unable to be mounted, with kernel message
> +like:
> ++
> +....
> +BTRFS error (device sdb): super_total_bytes 92017859088384 mismatch with fs_devices total_rw_bytes 92017859094528 
> +....
> ++
> +This option will fix both problems by aligning all size of devices, and
> +re-calculating superblock total bytes.
> ++
> +Although such repairing is included in *--repair* option, considering the
> +safety of *--repair*, this option is provided to suppress all other dangerous
> +repairing and only fix device sizes related problems.
>  
>  DANGEROUS OPTIONS
>  -----------------
> diff --git a/cmds-check.c b/cmds-check.c
> index 007781fa5d1b..fdb6d832eee1 100644
> --- a/cmds-check.c
> +++ b/cmds-check.c
> @@ -11746,6 +11746,8 @@ out:
>  	return err;
>  }
>  
> +static int reset_devs_size(struct btrfs_fs_info *fs_info);
> +
>  static int do_check_chunks_and_extents(struct btrfs_fs_info *fs_info)
>  {
>  	int ret;
> @@ -11756,6 +11758,12 @@ static int do_check_chunks_and_extents(struct btrfs_fs_info *fs_info)
>  		ret = check_chunks_and_extents_v2(fs_info);
>  	else
>  		ret = check_chunks_and_extents(fs_info);
> +	/* Also repair device sizes if needed */
> +	if (repair && !ret) {
> +		ret = reset_devs_size(fs_info);
> +		if (ret > 0)
> +			ret = 0;
> +	}
>  
>  	return ret;
>  }
> @@ -13088,6 +13096,8 @@ const char * const cmd_check_usage[] = {
>  	"-b|--backup                 use the first valid backup root copy",
>  	"--force                     skip mount checks, repair is not possible",
>  	"--repair                    try to repair the filesystem",
> +	"--fix-dev-size              repair device size related problem",
> +	"                            will not trigger other repair",
>  	"--readonly                  run in read-only mode (default)",
>  	"--init-csum-tree            create a new CRC tree",
>  	"--init-extent-tree          create a new extent tree",
> @@ -13128,6 +13138,7 @@ int cmd_check(int argc, char **argv)
>  	int qgroups_repaired = 0;
>  	unsigned ctree_flags = OPEN_CTREE_EXCLUSIVE;
>  	int force = 0;
> +	bool fix_dev_size = false;
>  
>  	while(1) {
>  		int c;
> @@ -13135,7 +13146,7 @@ int cmd_check(int argc, char **argv)
>  			GETOPT_VAL_INIT_EXTENT, GETOPT_VAL_CHECK_CSUM,
>  			GETOPT_VAL_READONLY, GETOPT_VAL_CHUNK_TREE,
>  			GETOPT_VAL_MODE, GETOPT_VAL_CLEAR_SPACE_CACHE,
> -			GETOPT_VAL_FORCE };
> +			GETOPT_VAL_FORCE, GETOPT_VAL_FIX_DEV_SIZE };
>  		static const struct option long_options[] = {
>  			{ "super", required_argument, NULL, 's' },
>  			{ "repair", no_argument, NULL, GETOPT_VAL_REPAIR },
> @@ -13158,6 +13169,8 @@ int cmd_check(int argc, char **argv)
>  			{ "clear-space-cache", required_argument, NULL,
>  				GETOPT_VAL_CLEAR_SPACE_CACHE},
>  			{ "force", no_argument, NULL, GETOPT_VAL_FORCE },
> +			{ "fix-dev-size", no_argument, NULL,
> +				GETOPT_VAL_FIX_DEV_SIZE },
>  			{ NULL, 0, NULL, 0}
>  		};
>  
> @@ -13245,6 +13258,11 @@ int cmd_check(int argc, char **argv)
>  			case GETOPT_VAL_FORCE:
>  				force = 1;
>  				break;
> +			case GETOPT_VAL_FIX_DEV_SIZE:
> +				fix_dev_size = true;
> +				repair = 1;
> +				ctree_flags |= OPEN_CTREE_WRITES;
> +				break;
>  		}
>  	}
>  
> @@ -13371,6 +13389,14 @@ int cmd_check(int argc, char **argv)
>  			report_qgroups(1);
>  		goto close_out;
>  	}
> +
> +	if (fix_dev_size) {
> +		ret = reset_devs_size(info);
> +		if (ret > 0)
> +			ret = 0;
> +		err |= !!ret;
> +		goto close_out;
> +	}
>  	if (subvolid) {
>  		printf("Print extent state for subvolume %llu on %s\nUUID: %s\n",
>  		       subvolid, argv[optind], uuidbuf);
> -- 
> 2.14.2
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/4] btrfs-progs: fsck: Introduce --fix-dev-size option
  2017-10-10 13:16   ` David Sterba
@ 2017-10-11  0:43     ` Qu Wenruo
  2017-10-26 18:58       ` David Sterba
  0 siblings, 1 reply; 14+ messages in thread
From: Qu Wenruo @ 2017-10-11  0:43 UTC (permalink / raw)
  To: dsterba, linux-btrfs, yoasif, rrauenza



On 2017年10月10日 21:16, David Sterba wrote:
> On Tue, Oct 10, 2017 at 07:51:11AM +0000, Qu Wenruo wrote:
>> Introduce --fix-dev-size option to fix device related problems.
> 
> Please don't add it to 'check', this is not the right place for the
> targeted fixes. -> 'btrfs rescue'

I'm OK moving the super total_bytes fix to 'btrfs rescue'.

But what about the alignment/mismatch detection part?
Is it still OK to detect them in 'btrfs check'?

And further more, the unaligned device total_bytes problem is not a big 
problem that fits into 'rescue' territory.

I'm not really sure about the difference between rescue and check.

Thanks,
Qu

> 
>> This repairing  is also included in --repair, but considering the safety
>> and potential affected users, it's better to provide a option to fix and
>> only fix device size related problem to avoid full --repair.
>>
>> Reported-by: Asif Youssuff <yoasif@gmail.com>
>> Reported-by: Rich Rauenzahn <rrauenza@gmail.com>
>> Signed-off-by: Qu Wenruo <quwenruo.btrfs@gmx.com>
>> ---
>>   Documentation/btrfs-check.asciidoc | 23 +++++++++++++++++++++++
>>   cmds-check.c                       | 28 +++++++++++++++++++++++++++-
>>   2 files changed, 50 insertions(+), 1 deletion(-)
>>
>> diff --git a/Documentation/btrfs-check.asciidoc b/Documentation/btrfs-check.asciidoc
>> index fbf48847ac25..e45c7a457bac 100644
>> --- a/Documentation/btrfs-check.asciidoc
>> +++ b/Documentation/btrfs-check.asciidoc
>> @@ -93,6 +93,29 @@ the entire free space cache. This option with 'v2' provides an alternative
>>   method of clearing the free space cache that doesn't require mounting the
>>   filesystem.
>>   
>> +--fix-dev-size::
>> +From v4.14-rc kernels, a more restrict device size checker is introduced, while
>> +old kernel doesn't strictly align its device size, so it may cause noisy kernel
>> +warning for newer kernel, like:
>> ++
>> +....
>> +WARNING: CPU: 3 PID: 439 at fs/btrfs/ctree.h:1559 btrfs_update_device+0x1c5/0x1d0 [btrfs]
>> +....
>> ++
>> +And for some case where super block total device size may mismatch with all
>> +devices, and the filesystem will be unable to be mounted, with kernel message
>> +like:
>> ++
>> +....
>> +BTRFS error (device sdb): super_total_bytes 92017859088384 mismatch with fs_devices total_rw_bytes 92017859094528
>> +....
>> ++
>> +This option will fix both problems by aligning all size of devices, and
>> +re-calculating superblock total bytes.
>> ++
>> +Although such repairing is included in *--repair* option, considering the
>> +safety of *--repair*, this option is provided to suppress all other dangerous
>> +repairing and only fix device sizes related problems.
>>   
>>   DANGEROUS OPTIONS
>>   -----------------
>> diff --git a/cmds-check.c b/cmds-check.c
>> index 007781fa5d1b..fdb6d832eee1 100644
>> --- a/cmds-check.c
>> +++ b/cmds-check.c
>> @@ -11746,6 +11746,8 @@ out:
>>   	return err;
>>   }
>>   
>> +static int reset_devs_size(struct btrfs_fs_info *fs_info);
>> +
>>   static int do_check_chunks_and_extents(struct btrfs_fs_info *fs_info)
>>   {
>>   	int ret;
>> @@ -11756,6 +11758,12 @@ static int do_check_chunks_and_extents(struct btrfs_fs_info *fs_info)
>>   		ret = check_chunks_and_extents_v2(fs_info);
>>   	else
>>   		ret = check_chunks_and_extents(fs_info);
>> +	/* Also repair device sizes if needed */
>> +	if (repair && !ret) {
>> +		ret = reset_devs_size(fs_info);
>> +		if (ret > 0)
>> +			ret = 0;
>> +	}
>>   
>>   	return ret;
>>   }
>> @@ -13088,6 +13096,8 @@ const char * const cmd_check_usage[] = {
>>   	"-b|--backup                 use the first valid backup root copy",
>>   	"--force                     skip mount checks, repair is not possible",
>>   	"--repair                    try to repair the filesystem",
>> +	"--fix-dev-size              repair device size related problem",
>> +	"                            will not trigger other repair",
>>   	"--readonly                  run in read-only mode (default)",
>>   	"--init-csum-tree            create a new CRC tree",
>>   	"--init-extent-tree          create a new extent tree",
>> @@ -13128,6 +13138,7 @@ int cmd_check(int argc, char **argv)
>>   	int qgroups_repaired = 0;
>>   	unsigned ctree_flags = OPEN_CTREE_EXCLUSIVE;
>>   	int force = 0;
>> +	bool fix_dev_size = false;
>>   
>>   	while(1) {
>>   		int c;
>> @@ -13135,7 +13146,7 @@ int cmd_check(int argc, char **argv)
>>   			GETOPT_VAL_INIT_EXTENT, GETOPT_VAL_CHECK_CSUM,
>>   			GETOPT_VAL_READONLY, GETOPT_VAL_CHUNK_TREE,
>>   			GETOPT_VAL_MODE, GETOPT_VAL_CLEAR_SPACE_CACHE,
>> -			GETOPT_VAL_FORCE };
>> +			GETOPT_VAL_FORCE, GETOPT_VAL_FIX_DEV_SIZE };
>>   		static const struct option long_options[] = {
>>   			{ "super", required_argument, NULL, 's' },
>>   			{ "repair", no_argument, NULL, GETOPT_VAL_REPAIR },
>> @@ -13158,6 +13169,8 @@ int cmd_check(int argc, char **argv)
>>   			{ "clear-space-cache", required_argument, NULL,
>>   				GETOPT_VAL_CLEAR_SPACE_CACHE},
>>   			{ "force", no_argument, NULL, GETOPT_VAL_FORCE },
>> +			{ "fix-dev-size", no_argument, NULL,
>> +				GETOPT_VAL_FIX_DEV_SIZE },
>>   			{ NULL, 0, NULL, 0}
>>   		};
>>   
>> @@ -13245,6 +13258,11 @@ int cmd_check(int argc, char **argv)
>>   			case GETOPT_VAL_FORCE:
>>   				force = 1;
>>   				break;
>> +			case GETOPT_VAL_FIX_DEV_SIZE:
>> +				fix_dev_size = true;
>> +				repair = 1;
>> +				ctree_flags |= OPEN_CTREE_WRITES;
>> +				break;
>>   		}
>>   	}
>>   
>> @@ -13371,6 +13389,14 @@ int cmd_check(int argc, char **argv)
>>   			report_qgroups(1);
>>   		goto close_out;
>>   	}
>> +
>> +	if (fix_dev_size) {
>> +		ret = reset_devs_size(info);
>> +		if (ret > 0)
>> +			ret = 0;
>> +		err |= !!ret;
>> +		goto close_out;
>> +	}
>>   	if (subvolid) {
>>   		printf("Print extent state for subvolume %llu on %s\nUUID: %s\n",
>>   		       subvolid, argv[optind], uuidbuf);
>> -- 
>> 2.14.2
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/4] btrfs-progs: fsck: Introduce --fix-dev-size option
  2017-10-11  0:43     ` Qu Wenruo
@ 2017-10-26 18:58       ` David Sterba
  2017-10-27  0:50         ` Qu Wenruo
  0 siblings, 1 reply; 14+ messages in thread
From: David Sterba @ 2017-10-26 18:58 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: dsterba, linux-btrfs, yoasif, rrauenza

On Wed, Oct 11, 2017 at 08:43:24AM +0800, Qu Wenruo wrote:
> On 2017年10月10日 21:16, David Sterba wrote:
> > On Tue, Oct 10, 2017 at 07:51:11AM +0000, Qu Wenruo wrote:
> >> Introduce --fix-dev-size option to fix device related problems.
> > 
> > Please don't add it to 'check', this is not the right place for the
> > targeted fixes. -> 'btrfs rescue'
> 
> I'm OK moving the super total_bytes fix to 'btrfs rescue'.
> 
> But what about the alignment/mismatch detection part?
> Is it still OK to detect them in 'btrfs check'?
> 
> And further more, the unaligned device total_bytes problem is not a big 
> problem that fits into 'rescue' territory.
> 
> I'm not really sure about the difference between rescue and check.

Check is supposed to find the problems, and rescue command group is for
specific fixes that are not suitable for 'check'. This is to avoid too
many specific options for 'check' and all the possible combinations.

We'll fix the total_bytes bug and don't expect it to be a problem in the
future again, so we can forget about the subcommand in rescue.

What should check report if it detects this kind of inconsistencies,
that's a good question. It could either fix them automatically (if it's
safe) or point to the specific command.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/4] btrfs-progs: fsck: Introduce --fix-dev-size option
  2017-10-26 18:58       ` David Sterba
@ 2017-10-27  0:50         ` Qu Wenruo
  0 siblings, 0 replies; 14+ messages in thread
From: Qu Wenruo @ 2017-10-27  0:50 UTC (permalink / raw)
  To: dsterba, linux-btrfs, yoasif, rrauenza


[-- Attachment #1.1: Type: text/plain, Size: 1860 bytes --]



On 2017年10月27日 02:58, David Sterba wrote:
> On Wed, Oct 11, 2017 at 08:43:24AM +0800, Qu Wenruo wrote:
>> On 2017年10月10日 21:16, David Sterba wrote:
>>> On Tue, Oct 10, 2017 at 07:51:11AM +0000, Qu Wenruo wrote:
>>>> Introduce --fix-dev-size option to fix device related problems.
>>>
>>> Please don't add it to 'check', this is not the right place for the
>>> targeted fixes. -> 'btrfs rescue'
>>
>> I'm OK moving the super total_bytes fix to 'btrfs rescue'.
>>
>> But what about the alignment/mismatch detection part?
>> Is it still OK to detect them in 'btrfs check'?
>>
>> And further more, the unaligned device total_bytes problem is not a big 
>> problem that fits into 'rescue' territory.
>>
>> I'm not really sure about the difference between rescue and check.
> 
> Check is supposed to find the problems, and rescue command group is for
> specific fixes that are not suitable for 'check'. This is to avoid too
> many specific options for 'check' and all the possible combinations.
> 
> We'll fix the total_bytes bug and don't expect it to be a problem in the
> future again, so we can forget about the subcommand in rescue.
> 
> What should check report if it detects this kind of inconsistencies,
> that's a good question. It could either fix them automatically (if it's
> safe) or point to the specific command.

OK, then current behavior, which points to rescue, is good enough.

I don't think it's a good idea to fix it automatically especially when
the default behavior is --readonly.
But fixing it in --repair makes sense, and that's already done in the
patchset.

Thanks,
Qu

> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 520 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2017-10-27  0:50 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-10-10  7:51 [PATCH 0/4] btrfs-progs repair support for unaligned/mismatched device sizes Qu Wenruo
2017-10-10  7:51 ` [PATCH 1/4] btrfs-progs: Introduce functions to repair unaligned/mismatch device size Qu Wenruo
2017-10-10  8:24   ` Nikolay Borisov
2017-10-10  7:51 ` [PATCH 2/4] btrfs-progs: fsck: Introduce --fix-dev-size option Qu Wenruo
2017-10-10 13:16   ` David Sterba
2017-10-11  0:43     ` Qu Wenruo
2017-10-26 18:58       ` David Sterba
2017-10-27  0:50         ` Qu Wenruo
2017-10-10  7:51 ` [PATCH 3/4] btrfs-progs: check: Also check unalignment/mismatch device and super size Qu Wenruo
2017-10-10  8:31   ` Nikolay Borisov
2017-10-10  8:34     ` Qu Wenruo
2017-10-10  7:51 ` [PATCH 4/4] btrfs-progs: test/fsck: Add test case image for --fix-dev-size Qu Wenruo
2017-10-10  8:15 ` [PATCH 0/4] btrfs-progs repair support for unaligned/mismatched device sizes Nikolay Borisov
2017-10-10  8:31   ` Qu Wenruo

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.