All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 0/6] Convert rollback rework for v4.9
@ 2016-12-19  6:56 Qu Wenruo
  2016-12-19  6:56 ` [PATCH v3 1/6] btrfs-progs: file-item: Fix wrong file extents inserted Qu Wenruo
                   ` (6 more replies)
  0 siblings, 7 replies; 18+ messages in thread
From: Qu Wenruo @ 2016-12-19  6:56 UTC (permalink / raw)
  To: linux-btrfs

Can be fetched from github:
https://github.com/adam900710/btrfs-progs.git convert_rework_for_4.9

This is mainly to fix problems exposed by Chandan's fix for 64K nodesize.

The problem is, although we're still using old rollback functions, it
has quite a lot of problems to support the new behavior.

1) Can't rollback new convert image with new data chunk
Chunk level check can't handle newly allocated data chunk, which is not
1:1 mapped but completely valid in new behavior.
The last patch will enhance the test case to handle it.

2) Can't rollback real no-hole image
Since it assumes hole file extent as requirement.
And due to the possibility to enable no_holes halfway, btrfsck won't
report such error, since it's acceptable.

3) Too complex logic, and RW btrfs tree operations
In fact, considering how small data we need to rewrite (1M + 128K), we
don't really need to open btrfs read-write.
Just copy needed data and re-fill. Simple and easy.

Thanks Chandan, his report on failure of rollback leads to this rework.

And this is still small fixes, most of the patch are just deleting old codes.
All rollback test cases from btrfs-progs and fstests are passed.

v2:
  Fix a bug that we can still rollback if convert subvolume is just orphaned,
  not deleted. Exposed by btrfs/012.

v3:
  Better patch split. Now only one patch is larger than 200 lines(214 lines)
  Fix regression introduced in last minute update of v2 patch.
  Remove duplicated bs= options.

Qu Wenruo (6):
  btrfs-progs: file-item: Fix wrong file extents inserted
  btrfs-progs: utils: Introduce basic set operations for range
  btrfs-progs: convert: Introduce function to record relocated ranges
  btrfs-progs: convert: Introduce new function to check if we can
    rollback
  btrfs-progs: convert: Switch to new rollback function
  btrfs-progs: convert-test: trigger chunk allocation after convert

 convert/main.c       | 744 ++++++++++++++++++++++-----------------------------
 disk-io.h            |   9 +-
 file-item.c          |  11 +
 tests/common         |   4 +
 tests/common.convert |   3 +
 utils.h              |  19 ++
 6 files changed, 366 insertions(+), 424 deletions(-)

-- 
2.10.2




^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v3 1/6] btrfs-progs: file-item: Fix wrong file extents inserted
  2016-12-19  6:56 [PATCH v3 0/6] Convert rollback rework for v4.9 Qu Wenruo
@ 2016-12-19  6:56 ` Qu Wenruo
  2016-12-19  6:56 ` [PATCH v3 2/6] btrfs-progs: utils: Introduce basic set operations for range Qu Wenruo
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 18+ messages in thread
From: Qu Wenruo @ 2016-12-19  6:56 UTC (permalink / raw)
  To: linux-btrfs

If we specify NO_HOLES incompat feature when converting, the result
image still uses hole file extents.
And further more, the hole is incorrect as its disk_num_bytes is not
zero.

The problem is at btrfs_insert_file_extent() which doesn't check if we
are going to insert hole file extent.

Modify it to skip hole file extents to allow it follow restrict NO_HOLES
flag.

And since no_holes flag can be triggered on half-way, so current fsck
won't report such error, as it consider it as old file holes.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
---
 convert/main.c |  2 +-
 file-item.c    | 11 +++++++++++
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/convert/main.c b/convert/main.c
index fd6f77b..15f14af 100644
--- a/convert/main.c
+++ b/convert/main.c
@@ -336,7 +336,7 @@ static int record_file_blocks(struct blk_iterate_data *data,
 		       key.offset > cur_off);
 		fi = btrfs_item_ptr(node, slot, struct btrfs_file_extent_item);
 		extent_disk_bytenr = btrfs_file_extent_disk_bytenr(node, fi);
-		extent_num_bytes = btrfs_file_extent_disk_num_bytes(node, fi);
+		extent_num_bytes = btrfs_file_extent_num_bytes(node, fi);
 		BUG_ON(cur_off - key.offset >= extent_num_bytes);
 		btrfs_release_path(&path);
 
diff --git a/file-item.c b/file-item.c
index 67c0b4f..e462b4b 100644
--- a/file-item.c
+++ b/file-item.c
@@ -36,11 +36,22 @@ int btrfs_insert_file_extent(struct btrfs_trans_handle *trans,
 			     u64 disk_num_bytes, u64 num_bytes)
 {
 	int ret = 0;
+	int is_hole = 0;
 	struct btrfs_file_extent_item *item;
 	struct btrfs_key file_key;
 	struct btrfs_path *path;
 	struct extent_buffer *leaf;
 
+	if (offset == 0)
+		is_hole = 1;
+	/* For NO_HOLES, we don't insert hole file extent */
+	if (btrfs_fs_incompat(root->fs_info, NO_HOLES) && is_hole)
+		return 0;
+
+	/* For hole, its disk_bytenr and disk_num_bytes must be 0 */
+	if (is_hole)
+		disk_num_bytes = 0;
+
 	path = btrfs_alloc_path();
 	if (!path)
 		return -ENOMEM;
-- 
2.10.2




^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v3 2/6] btrfs-progs: utils: Introduce basic set operations for range
  2016-12-19  6:56 [PATCH v3 0/6] Convert rollback rework for v4.9 Qu Wenruo
  2016-12-19  6:56 ` [PATCH v3 1/6] btrfs-progs: file-item: Fix wrong file extents inserted Qu Wenruo
@ 2016-12-19  6:56 ` Qu Wenruo
  2017-01-23 17:28   ` David Sterba
  2017-01-23 17:40   ` David Sterba
  2016-12-19  6:56 ` [PATCH v3 3/6] btrfs-progs: convert: Introduce function to record relocated ranges Qu Wenruo
                   ` (4 subsequent siblings)
  6 siblings, 2 replies; 18+ messages in thread
From: Qu Wenruo @ 2016-12-19  6:56 UTC (permalink / raw)
  To: linux-btrfs

Introduce basic set operations: is_subset() and is_intersection().

This is quite useful to check if a range [start, start + len) subset or
intersection of another range.
So we don't need to use open code to do it, which I sometimes do it
wrong.

Also use these new facilities in btrfs-convert, to check if a range is a
subset or intersects with btrfs convert reserved ranges.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
---
 convert/main.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++++++
 disk-io.h      |  9 +++++++--
 utils.h        | 19 +++++++++++++++++++
 3 files changed, 77 insertions(+), 2 deletions(-)

diff --git a/convert/main.c b/convert/main.c
index 15f14af..db6d371 100644
--- a/convert/main.c
+++ b/convert/main.c
@@ -61,6 +61,15 @@
 
 #define CONV_IMAGE_SUBVOL_OBJECTID BTRFS_FIRST_FREE_OBJECTID
 
+/*
+ * Btrfs reserved ranges.
+ * In these ranges, btrfs record superblocks, so old fs data in these
+ * range can be relocated to other physical location
+ */
+static u64 reserved_range_starts[3] = { 0, BTRFS_SB_MIRROR_OFFSET(1),
+					BTRFS_SB_MIRROR_OFFSET(2) };
+static u64 reserved_range_lens[3] = { 1024 * 1024, 64 * 1024, 64 * 1024 };
+
 struct task_ctx {
 	uint32_t max_copy_inodes;
 	uint32_t cur_copy_inodes;
@@ -2672,6 +2681,48 @@ fail:
 	return -1;
 }
 
+/*
+ * Check if [start, start + len) is a subset of btrfs reserved ranges
+ */
+static int is_range_subset_of_reserved_ranges(u64 start, u64 len)
+{
+	int i;
+	int ret = 0;
+
+	for (i = 0; i < ARRAY_SIZE(reserved_range_starts); i++) {
+		if (is_range_subset(start, len, reserved_range_starts[i],
+				    reserved_range_lens[i])) {
+			ret = 1;
+			break;
+		}
+	}
+	return ret;
+}
+
+/*
+ * Check if [start, start + len) intersects with btrfs reserved ranges
+ * if intersects, record the first range it intersects with to @ret_index
+ */
+static int is_range_intersection_of_reserved_ranges(u64 start, u64 len,
+						    int *ret_index)
+{
+	int nr = -1;
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(reserved_range_starts); i++) {
+		if (is_range_intersect(start, len, reserved_range_starts[i],
+				       reserved_range_lens[i])) {
+			nr = i;
+			break;
+		}
+	}
+	if (nr == -1)
+		return 0;
+	if (ret_index)
+		*ret_index = nr;
+	return 1;
+}
+
 static int do_rollback(const char *devname)
 {
 	int fd = -1;
diff --git a/disk-io.h b/disk-io.h
index 1c8387e..af6fcfd 100644
--- a/disk-io.h
+++ b/disk-io.h
@@ -97,11 +97,16 @@ enum btrfs_read_sb_flags {
 	SBREAD_PARTIAL		= (1 << 1),
 };
 
+/*
+ * Use macro to define mirror super block position
+ * So we can use it in static array initializtion
+ */
+#define BTRFS_SB_MIRROR_OFFSET(mirror)	((u64)(16 * 1024) << \
+					 (BTRFS_SUPER_MIRROR_SHIFT * (mirror)))
 static inline u64 btrfs_sb_offset(int mirror)
 {
-	u64 start = 16 * 1024;
 	if (mirror)
-		return start << (BTRFS_SUPER_MIRROR_SHIFT * mirror);
+		return BTRFS_SB_MIRROR_OFFSET(mirror);
 	return BTRFS_SUPER_INFO_OFFSET;
 }
 
diff --git a/utils.h b/utils.h
index 366ca29..39ca970 100644
--- a/utils.h
+++ b/utils.h
@@ -457,4 +457,23 @@ unsigned int rand_range(unsigned int upper);
 /* Also allow setting the seed manually */
 void init_rand_seed(u64 seed);
 
+/*
+ * Basic set operations
+ * Mainly used for ranges subset/intersect
+ */
+/* Check if [start1, start1 + len1) is subset of [start2, start2 + len2) */
+static inline int is_range_subset(u64 start1, u64 len1, u64 start2, u64 len2)
+{
+	if (start1 >= start2 && start1 + len1 <= start2 + len2)
+		return 1;
+	return 0;
+}
+
+/* Check if [start1, start1 + len1) intersects with [start2, start2 + len2) */
+static inline int is_range_intersect(u64 start1, u64 len1, u64 start2, u64 len2)
+{
+	if (start1 >= start2 + len2 || start1 + len1 <= start2)
+		return 0;
+	return 1;
+}
 #endif
-- 
2.10.2




^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v3 3/6] btrfs-progs: convert: Introduce function to record relocated ranges
  2016-12-19  6:56 [PATCH v3 0/6] Convert rollback rework for v4.9 Qu Wenruo
  2016-12-19  6:56 ` [PATCH v3 1/6] btrfs-progs: file-item: Fix wrong file extents inserted Qu Wenruo
  2016-12-19  6:56 ` [PATCH v3 2/6] btrfs-progs: utils: Introduce basic set operations for range Qu Wenruo
@ 2016-12-19  6:56 ` Qu Wenruo
  2016-12-19  6:56 ` [PATCH v3 4/6] btrfs-progs: convert: Introduce new function to check if we can rollback Qu Wenruo
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 18+ messages in thread
From: Qu Wenruo @ 2016-12-19  6:56 UTC (permalink / raw)
  To: linux-btrfs

Introduce a new function, record_reloc_data(), to read out any data that
is a subset of reserved btrfs ranges.

This function is mainly a compatible function for old convert behavior.
As old convert behavior only reloc super blocks (4K size), while the new
convert behavior will reloc 64K block.

So this function will truncate and map the file extents of the converted
image, and read out corresponding data for later rollback usage.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
---
 convert/main.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 58 insertions(+)

diff --git a/convert/main.c b/convert/main.c
index db6d371..87c52c1 100644
--- a/convert/main.c
+++ b/convert/main.c
@@ -2723,6 +2723,64 @@ static int is_range_intersection_of_reserved_ranges(u64 start, u64 len,
 	return 1;
 }
 
+/*
+ * Read out data in reserved_ranges and write them into @reloc_ranges
+ *
+ * This is mainly for old convert behavior.
+ * Which only relocs the super block (4K), while we now reloc 64K.
+ *
+ * So there is some range in reserved range but still mapped 1:1.
+ * We use this function to read them into @reloc_ranges
+ */
+static int record_reloc_data(struct btrfs_fs_info *fs_info,
+			     u64 old_offset, u64 disk_bytenr, u64 ram_len,
+			     char *reloc_ranges[3])
+{
+	/* Which reloc range we interests with, physical bytenr */
+	u64 reloc_start;
+	u64 reloc_len;
+
+	/*
+	 * File offset, which get truncated to above range,
+	 * Offset in image file
+	 */
+	u64 file_start;
+	u64 file_len;
+
+	/* Btrfs logical address, used for final reading*/
+	u64 logical_start;
+	u64 logical_len;
+	char *dest;
+	int nr;
+	int ret;
+
+	/* Check if the range intersects */
+	if (!is_range_intersection_of_reserved_ranges(old_offset, ram_len,
+						      &nr))
+		return 0;
+	reloc_start = reserved_range_starts[nr];
+	reloc_len = reserved_range_lens[nr];
+
+	/* Truncate the range to reserved ranges */
+	file_start = max(old_offset, reloc_start);
+	file_len = min(old_offset + ram_len, reloc_start + reloc_len) -
+			file_start;
+
+	/* Get btrfs logical address */
+	logical_start = file_start - old_offset + disk_bytenr;
+	logical_len = file_len;
+
+	dest = reloc_ranges[nr] + file_start - reloc_start;
+	ret = read_extent_data(fs_info->tree_root, dest, logical_start,
+				&logical_len, 0);
+	if (ret < 0)
+		return ret;
+	/* Short read, something went wrong */
+	if (logical_len != file_len)
+		return -EIO;
+	return 0;
+}
+
 static int do_rollback(const char *devname)
 {
 	int fd = -1;
-- 
2.10.2




^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v3 4/6] btrfs-progs: convert: Introduce new function to check if we can rollback
  2016-12-19  6:56 [PATCH v3 0/6] Convert rollback rework for v4.9 Qu Wenruo
                   ` (2 preceding siblings ...)
  2016-12-19  6:56 ` [PATCH v3 3/6] btrfs-progs: convert: Introduce function to record relocated ranges Qu Wenruo
@ 2016-12-19  6:56 ` Qu Wenruo
  2016-12-19  6:56 ` [PATCH v3 5/6] btrfs-progs: convert: Switch to new rollback function Qu Wenruo
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 18+ messages in thread
From: Qu Wenruo @ 2016-12-19  6:56 UTC (permalink / raw)
  To: linux-btrfs

Introduce new function, check_rollback(), to collect data and check if
we can rollback the image.

The check part is quite straight forward:
Ensure all the file extents, except the ones inside reserved ranges, are
mapped 1:1 on disk.

The ones inside reserved ranges, 0~1M, 1st sb +64K, 2nd sb +64K, can be
mapped to anywhere, as btrfs needs to put super blocks in them.

Such behavir can meet both old convert(one large chunk) and new
convert(only old image file is 1:1 mapped).

Also, the function will read out the data in btrfs reserved ranges for
later rollback usage.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
---
 convert/main.c | 214 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 214 insertions(+)

diff --git a/convert/main.c b/convert/main.c
index 87c52c1..5b9141b 100644
--- a/convert/main.c
+++ b/convert/main.c
@@ -2781,6 +2781,220 @@ static int record_reloc_data(struct btrfs_fs_info *fs_info,
 	return 0;
 }
 
+static int check_image_file_extents(struct btrfs_root *image_root, u64 ino,
+				    u64 total_size, char *reloc_ranges[3])
+{
+	struct btrfs_key key;
+	struct btrfs_path path;
+	struct btrfs_fs_info *fs_info = image_root->fs_info;
+	u64 checked_bytes = 0;
+	int ret;
+
+	key.objectid = ino;
+	key.offset = 0;
+	key.type = BTRFS_EXTENT_DATA_KEY;
+
+	btrfs_init_path(&path);
+	ret = btrfs_search_slot(NULL, image_root, &key, &path, 0, 0);
+	/*
+	 * It's possible that some fs doesn't store any(including sb)
+	 * data into 0~1M range, and NO_HOLES is enabled.
+	 *
+	 * So only needs to check ret < 0 case
+	 */
+	if (ret < 0) {
+		error("failed to iterate file extents at offset 0: %s",
+			strerror(-ret));
+		btrfs_release_path(&path);
+		return ret;
+	}
+
+	/* Loop from the first file extents */
+	while (1) {
+		struct btrfs_file_extent_item *fi;
+		struct extent_buffer *leaf = path.nodes[0];
+		u64 disk_bytenr;
+		u64 file_offset;
+		u64 ram_bytes;
+		u64 extent_offset;
+		int slot = path.slots[0];
+
+		if (slot >= btrfs_header_nritems(leaf))
+			goto next;
+		btrfs_item_key_to_cpu(leaf, &key, slot);
+
+		/*
+		 * Iteration is done, exit normally, we have extra check out of
+		 * the loop
+		 */
+		if (key.objectid != ino || key.type != BTRFS_EXTENT_DATA_KEY) {
+			ret = 0;
+			break;
+		}
+		file_offset = key.offset;
+		fi = btrfs_item_ptr(leaf, slot, struct btrfs_file_extent_item);
+		if (btrfs_file_extent_type(leaf, fi) != BTRFS_FILE_EXTENT_REG) {
+			ret = -EINVAL;
+			error(
+		"ino %llu offset %llu doesn't have a regular file extent",
+				ino, file_offset);
+			break;
+		}
+		if (btrfs_file_extent_compression(leaf, fi) ||
+		    btrfs_file_extent_encryption(leaf, fi) ||
+		    btrfs_file_extent_other_encoding(leaf, fi)) {
+			ret = -EINVAL;
+			error(
+			"ino %llu offset %llu doesn't have a plain file extent",
+				ino, file_offset);
+			break;
+		}
+
+		disk_bytenr = btrfs_file_extent_disk_bytenr(leaf, fi);
+		ram_bytes = btrfs_file_extent_ram_bytes(leaf, fi);
+		extent_offset = btrfs_file_extent_offset(leaf, fi);
+
+		checked_bytes += ram_bytes;
+		/* Skip hole */
+		if (disk_bytenr == 0)
+			goto next;
+
+		if (file_offset != disk_bytenr) {
+			/*
+			 * Only file extent in btrfs reserved ranges are allow
+			 * non-1:1 mapped
+			 */
+			if (!is_range_subset_of_reserved_ranges(file_offset,
+							ram_bytes)) {
+				ret = -EINVAL;
+				error(
+		"ino %llu offset %llu file extent should not be relocated",
+					ino, file_offset);
+				break;
+			}
+		}
+		ret = record_reloc_data(fs_info, file_offset,
+				disk_bytenr + extent_offset, ram_bytes,
+				reloc_ranges);
+		if (ret < 0) {
+			error("ino %llu offset %llu failed to read extent data",
+				ino, file_offset);
+			break;
+		}
+next:
+		ret = btrfs_next_item(image_root, &path);
+		if (ret) {
+			if (ret > 0)
+				ret = 0;
+			break;
+		}
+	}
+	btrfs_release_path(&path);
+	/*
+	 * For HOLES mode (without NO_HOLES), we must ensure file extents
+	 * cover the whole range of the image
+	 */
+	if (!ret && !btrfs_fs_incompat(fs_info, NO_HOLES)) {
+		if (checked_bytes != total_size) {
+			ret = -EINVAL;
+			error("inode %llu has some file extents not checked",
+				ino);
+		}
+	}
+	return ret;
+}
+
+/*
+ * Check and record needed blocks for rollback.
+ *
+ * It will record data for superblock and reserved ranges to reloc_ranges[].
+ * So we can rollback the fs after close_ctree().
+ */
+static int check_rollback(struct btrfs_fs_info *fs_info, char *reloc_ranges[3])
+{
+	struct btrfs_root *image_root;
+	struct btrfs_dir_item *dir;
+	struct btrfs_path path;
+	struct btrfs_key key;
+	struct btrfs_inode_item *inode_item;
+	char *image_name = "image";
+	u64 ino;
+	u64 root_dir;
+	u64 total_bytes;
+	int ret;
+
+	btrfs_init_path(&path);
+
+	/*
+	 * Search for root backref, or after subvolume delete(orphan),
+	 * we can still rollback if the subvolume is just orphan.
+	 */
+	key.objectid = CONV_IMAGE_SUBVOL_OBJECTID;
+	key.type = BTRFS_ROOT_BACKREF_KEY;
+	key.offset = BTRFS_FS_TREE_OBJECTID;
+
+	ret = btrfs_search_slot(NULL, fs_info->tree_root, &key, &path, 0, 0);
+	btrfs_release_path(&path);
+	if (ret > 0) {
+		error("unable to convert ext2 image subvolume, is it deleted?");
+		return -ENOENT;
+	} else if (ret < 0) {
+		error("failed to find ext2 image subvolume: %s",
+			strerror(-ret));
+		return ret;
+	}
+
+	/* Search convert subvolume */
+	key.objectid = CONV_IMAGE_SUBVOL_OBJECTID;
+	key.type = BTRFS_ROOT_ITEM_KEY;
+	key.offset = (u64)-1;
+
+	image_root = btrfs_read_fs_root(fs_info, &key);
+	if (IS_ERR(image_root)) {
+		ret = PTR_ERR(image_root);
+		error("failed to open convert image subvolume: %s",
+			strerror(-ret));
+		return ret;
+	}
+
+	/* Search the image file */
+	root_dir = btrfs_root_dirid(&image_root->root_item);
+	dir = btrfs_lookup_dir_item(NULL, image_root, &path, root_dir,
+			image_name, strlen(image_name), 0);
+
+	if (!dir || IS_ERR(dir)) {
+		btrfs_release_path(&path);
+		if (dir)
+			ret = PTR_ERR(dir);
+		else
+			ret = -ENOENT;
+		error("failed to locate file %s: %s", image_name,
+			strerror(-ret));
+		return ret;
+	}
+	btrfs_dir_item_key_to_cpu(path.nodes[0], dir, &key);
+	btrfs_release_path(&path);
+
+	/* Get total size of the original image */
+	ino = key.objectid;
+
+	ret = btrfs_lookup_inode(NULL, image_root, &path, &key, 0);
+	if (ret < 0) {
+		btrfs_release_path(&path);
+		error("unable to find inode %llu: %s", ino, strerror(-ret));
+		return ret;
+	}
+	inode_item = btrfs_item_ptr(path.nodes[0], path.slots[0],
+				    struct btrfs_inode_item);
+	total_bytes = btrfs_inode_size(path.nodes[0], inode_item);
+	btrfs_release_path(&path);
+
+	/* Main function to check every file extent of the image file */
+	ret = check_image_file_extents(image_root, ino, total_bytes,
+					reloc_ranges);
+	return ret;
+}
+
 static int do_rollback(const char *devname)
 {
 	int fd = -1;
-- 
2.10.2




^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v3 5/6] btrfs-progs: convert: Switch to new rollback function
  2016-12-19  6:56 [PATCH v3 0/6] Convert rollback rework for v4.9 Qu Wenruo
                   ` (3 preceding siblings ...)
  2016-12-19  6:56 ` [PATCH v3 4/6] btrfs-progs: convert: Introduce new function to check if we can rollback Qu Wenruo
@ 2016-12-19  6:56 ` Qu Wenruo
  2016-12-20  5:36   ` Chandan Rajendra
  2017-01-23 17:54   ` David Sterba
  2016-12-19  6:56 ` [PATCH v3 6/6] btrfs-progs: convert-test: trigger chunk allocation after convert Qu Wenruo
  2016-12-21 14:35 ` [PATCH v3 0/6] Convert rollback rework for v4.9 David Sterba
  6 siblings, 2 replies; 18+ messages in thread
From: Qu Wenruo @ 2016-12-19  6:56 UTC (permalink / raw)
  To: linux-btrfs

Since we have the whole facilities needed to rollback, switch to the new
rollback.

The new rollback function can handle the following things that old
rollback either can't handle or just refuse to rollback:

1) New converted btrfs which allocates new data chunk
   This is due to the too restrict may_roll_back() condition, which is
   never a good friend for new convert behavior.

   The new rollback behavior fixes it by not checking data chunks, but
   only to check the file extents of the convert image file.

   If all file extents except the ones in reserved ranges, then we allow
   rollback.

2) New converted btrfs which enabled NO_HOLES feature
   Thanks to previous patches, we can convert to real NO_HOLES btrfs.

   And since old rollback assumes that file extents and holes covers the
   whole image file, it will fail due to the non-exists holes.

   Fix it by iterating file extents of convert image, and only compare
   the size we checked against file size if NO_HOLES is not enabled.

And makes rollback function simpler:

1) Read-n-write vs extra chunk tree relocation
   Since converted btrfs only has 3 ranges that are not 1:1 mapped, just
   read this data out, and close btrfs, finally write them into position
   will be good enough, for both new convert and old convert.
   (To be more specific, old convert is just a subset of more universal
    new convert behavior)

   No extra work is needed any more, and we can even open the btrfs RO.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
---
 convert/main.c | 527 ++++++---------------------------------------------------
 1 file changed, 52 insertions(+), 475 deletions(-)

diff --git a/convert/main.c b/convert/main.c
index 5b9141b..3ff864b 100644
--- a/convert/main.c
+++ b/convert/main.c
@@ -1421,36 +1421,6 @@ fail:
 	return ret;
 }
 
-static int prepare_system_chunk_sb(struct btrfs_super_block *super)
-{
-	struct btrfs_chunk *chunk;
-	struct btrfs_disk_key *key;
-	u32 sectorsize = btrfs_super_sectorsize(super);
-
-	key = (struct btrfs_disk_key *)(super->sys_chunk_array);
-	chunk = (struct btrfs_chunk *)(super->sys_chunk_array +
-				       sizeof(struct btrfs_disk_key));
-
-	btrfs_set_disk_key_objectid(key, BTRFS_FIRST_CHUNK_TREE_OBJECTID);
-	btrfs_set_disk_key_type(key, BTRFS_CHUNK_ITEM_KEY);
-	btrfs_set_disk_key_offset(key, 0);
-
-	btrfs_set_stack_chunk_length(chunk, btrfs_super_total_bytes(super));
-	btrfs_set_stack_chunk_owner(chunk, BTRFS_EXTENT_TREE_OBJECTID);
-	btrfs_set_stack_chunk_stripe_len(chunk, BTRFS_STRIPE_LEN);
-	btrfs_set_stack_chunk_type(chunk, BTRFS_BLOCK_GROUP_SYSTEM);
-	btrfs_set_stack_chunk_io_align(chunk, sectorsize);
-	btrfs_set_stack_chunk_io_width(chunk, sectorsize);
-	btrfs_set_stack_chunk_sector_size(chunk, sectorsize);
-	btrfs_set_stack_chunk_num_stripes(chunk, 1);
-	btrfs_set_stack_chunk_sub_stripes(chunk, 0);
-	chunk->stripe.devid = super->dev_item.devid;
-	btrfs_set_stack_stripe_offset(&chunk->stripe, 0);
-	memcpy(chunk->stripe.dev_uuid, super->dev_item.uuid, BTRFS_UUID_SIZE);
-	btrfs_set_super_sys_array_size(super, sizeof(*key) + sizeof(*chunk));
-	return 0;
-}
-
 #if BTRFSCONVERT_EXT2
 
 /*
@@ -2557,131 +2527,6 @@ fail:
 }
 
 /*
- * Check if a non 1:1 mapped chunk can be rolled back.
- * For new convert, it's OK while for old convert it's not.
- */
-static int may_rollback_chunk(struct btrfs_fs_info *fs_info, u64 bytenr)
-{
-	struct btrfs_block_group_cache *bg;
-	struct btrfs_key key;
-	struct btrfs_path path;
-	struct btrfs_root *extent_root = fs_info->extent_root;
-	u64 bg_start;
-	u64 bg_end;
-	int ret;
-
-	bg = btrfs_lookup_first_block_group(fs_info, bytenr);
-	if (!bg)
-		return -ENOENT;
-	bg_start = bg->key.objectid;
-	bg_end = bg->key.objectid + bg->key.offset;
-
-	key.objectid = bg_end;
-	key.type = BTRFS_METADATA_ITEM_KEY;
-	key.offset = 0;
-	btrfs_init_path(&path);
-
-	ret = btrfs_search_slot(NULL, extent_root, &key, &path, 0, 0);
-	if (ret < 0)
-		return ret;
-
-	while (1) {
-		struct btrfs_extent_item *ei;
-
-		ret = btrfs_previous_extent_item(extent_root, &path, bg_start);
-		if (ret > 0) {
-			ret = 0;
-			break;
-		}
-		if (ret < 0)
-			break;
-
-		btrfs_item_key_to_cpu(path.nodes[0], &key, path.slots[0]);
-		if (key.type == BTRFS_METADATA_ITEM_KEY)
-			continue;
-		/* Now it's EXTENT_ITEM_KEY only */
-		ei = btrfs_item_ptr(path.nodes[0], path.slots[0],
-				    struct btrfs_extent_item);
-		/*
-		 * Found data extent, means this is old convert must follow 1:1
-		 * mapping.
-		 */
-		if (btrfs_extent_flags(path.nodes[0], ei)
-				& BTRFS_EXTENT_FLAG_DATA) {
-			ret = -EINVAL;
-			break;
-		}
-	}
-	btrfs_release_path(&path);
-	return ret;
-}
-
-static int may_rollback(struct btrfs_root *root)
-{
-	struct btrfs_fs_info *info = root->fs_info;
-	struct btrfs_multi_bio *multi = NULL;
-	u64 bytenr;
-	u64 length;
-	u64 physical;
-	u64 total_bytes;
-	int num_stripes;
-	int ret;
-
-	if (btrfs_super_num_devices(info->super_copy) != 1)
-		goto fail;
-
-	bytenr = BTRFS_SUPER_INFO_OFFSET;
-	total_bytes = btrfs_super_total_bytes(root->fs_info->super_copy);
-
-	while (1) {
-		ret = btrfs_map_block(&info->mapping_tree, WRITE, bytenr,
-				      &length, &multi, 0, NULL);
-		if (ret) {
-			if (ret == -ENOENT) {
-				/* removed block group at the tail */
-				if (length == (u64)-1)
-					break;
-
-				/* removed block group in the middle */
-				goto next;
-			}
-			goto fail;
-		}
-
-		num_stripes = multi->num_stripes;
-		physical = multi->stripes[0].physical;
-		free(multi);
-
-		if (num_stripes != 1) {
-			error("num stripes for bytenr %llu is not 1", bytenr);
-			goto fail;
-		}
-
-		/*
-		 * Extra check for new convert, as metadata chunk from new
-		 * convert is much more free than old convert, it doesn't need
-		 * to do 1:1 mapping.
-		 */
-		if (physical != bytenr) {
-			/*
-			 * Check if it's a metadata chunk and has only metadata
-			 * extent.
-			 */
-			ret = may_rollback_chunk(info, bytenr);
-			if (ret < 0)
-				goto fail;
-		}
-next:
-		bytenr += length;
-		if (bytenr >= total_bytes)
-			break;
-	}
-	return 0;
-fail:
-	return -1;
-}
-
-/*
  * Check if [start, start + len) is a subset of btrfs reserved ranges
  */
 static int is_range_subset_of_reserved_ranges(u64 start, u64 len)
@@ -2998,352 +2843,84 @@ static int check_rollback(struct btrfs_fs_info *fs_info, char *reloc_ranges[3])
 static int do_rollback(const char *devname)
 {
 	int fd = -1;
+	struct btrfs_root *root;
+	struct btrfs_fs_info *fs_info;
+	char *reloc_ranges[3] = { NULL };
+	off_t fsize;
 	int ret;
 	int i;
-	struct btrfs_root *root;
-	struct btrfs_root *image_root;
-	struct btrfs_root *chunk_root;
-	struct btrfs_dir_item *dir;
-	struct btrfs_inode_item *inode;
-	struct btrfs_file_extent_item *fi;
-	struct btrfs_trans_handle *trans;
-	struct extent_buffer *leaf;
-	struct btrfs_block_group_cache *cache1;
-	struct btrfs_block_group_cache *cache2;
-	struct btrfs_key key;
-	struct btrfs_path path;
-	struct extent_io_tree io_tree;
-	char *buf = NULL;
-	char *name;
-	u64 bytenr;
-	u64 num_bytes;
-	u64 root_dir;
-	u64 objectid;
-	u64 offset;
-	u64 start;
-	u64 end;
-	u64 sb_bytenr;
-	u64 first_free;
-	u64 total_bytes;
-	u32 sectorsize;
 
-	extent_io_tree_init(&io_tree);
+	for (i = 0; i < ARRAY_SIZE(reserved_range_starts); i++) {
+		reloc_ranges[i] = calloc(1, reserved_range_lens[i]);
+		if (!reloc_ranges[i]) {
+			ret = -ENOMEM;
+			goto free_mem;
+		}
+	}
 
 	fd = open(devname, O_RDWR);
 	if (fd < 0) {
 		error("unable to open %s: %s", devname, strerror(errno));
+		ret = -errno;
 		goto fail;
 	}
-	root = open_ctree_fd(fd, devname, 0, OPEN_CTREE_WRITES);
+
+	fsize = lseek(fd, 0, SEEK_END);
+	root = open_ctree_fd(fd, devname, 0, 0);
 	if (!root) {
 		error("unable to open ctree");
+		ret = -EIO;
 		goto fail;
 	}
-	ret = may_rollback(root);
-	if (ret < 0) {
-		error("unable to do rollback: %d", ret);
-		goto fail;
-	}
-
-	sectorsize = root->sectorsize;
-	buf = malloc(sectorsize);
-	if (!buf) {
-		error("unable to allocate memory");
-		goto fail;
-	}
-
-	btrfs_init_path(&path);
-
-	key.objectid = CONV_IMAGE_SUBVOL_OBJECTID;
-	key.type = BTRFS_ROOT_BACKREF_KEY;
-	key.offset = BTRFS_FS_TREE_OBJECTID;
-	ret = btrfs_search_slot(NULL, root->fs_info->tree_root, &key, &path, 0,
-				0);
-	btrfs_release_path(&path);
-	if (ret > 0) {
-		error("unable to convert ext2 image subvolume, is it deleted?");
-		goto fail;
-	} else if (ret < 0) {
-		error("unable to open ext2_saved, id %llu: %s",
-			(unsigned long long)key.objectid, strerror(-ret));
-		goto fail;
-	}
-
-	key.objectid = CONV_IMAGE_SUBVOL_OBJECTID;
-	key.type = BTRFS_ROOT_ITEM_KEY;
-	key.offset = (u64)-1;
-	image_root = btrfs_read_fs_root(root->fs_info, &key);
-	if (!image_root || IS_ERR(image_root)) {
-		error("unable to open subvolume %llu: %ld",
-			(unsigned long long)key.objectid, PTR_ERR(image_root));
-		goto fail;
-	}
-
-	name = "image";
-	root_dir = btrfs_root_dirid(&root->root_item);
-	dir = btrfs_lookup_dir_item(NULL, image_root, &path,
-				   root_dir, name, strlen(name), 0);
-	if (!dir || IS_ERR(dir)) {
-		error("unable to find file %s: %ld", name, PTR_ERR(dir));
-		goto fail;
-	}
-	leaf = path.nodes[0];
-	btrfs_dir_item_key_to_cpu(leaf, dir, &key);
-	btrfs_release_path(&path);
-
-	objectid = key.objectid;
-
-	ret = btrfs_lookup_inode(NULL, image_root, &path, &key, 0);
-	if (ret) {
-		error("unable to find inode item: %d", ret);
-		goto fail;
-	}
-	leaf = path.nodes[0];
-	inode = btrfs_item_ptr(leaf, path.slots[0], struct btrfs_inode_item);
-	total_bytes = btrfs_inode_size(leaf, inode);
-	btrfs_release_path(&path);
-
-	key.objectid = objectid;
-	key.offset = 0;
-	key.type = BTRFS_EXTENT_DATA_KEY;
-	ret = btrfs_search_slot(NULL, image_root, &key, &path, 0, 0);
-	if (ret != 0) {
-		error("unable to find first file extent: %d", ret);
-		btrfs_release_path(&path);
-		goto fail;
-	}
-
-	/* build mapping tree for the relocated blocks */
-	for (offset = 0; offset < total_bytes; ) {
-		leaf = path.nodes[0];
-		if (path.slots[0] >= btrfs_header_nritems(leaf)) {
-			ret = btrfs_next_leaf(root, &path);
-			if (ret != 0)
-				break;	
-			continue;
-		}
-
-		btrfs_item_key_to_cpu(leaf, &key, path.slots[0]);
-		if (key.objectid != objectid || key.offset != offset ||
-		    key.type != BTRFS_EXTENT_DATA_KEY)
-			break;
-
-		fi = btrfs_item_ptr(leaf, path.slots[0],
-				    struct btrfs_file_extent_item);
-		if (btrfs_file_extent_type(leaf, fi) != BTRFS_FILE_EXTENT_REG)
-			break;
-		if (btrfs_file_extent_compression(leaf, fi) ||
-		    btrfs_file_extent_encryption(leaf, fi) ||
-		    btrfs_file_extent_other_encoding(leaf, fi))
-			break;
+	fs_info = root->fs_info;
 
-		bytenr = btrfs_file_extent_disk_bytenr(leaf, fi);
-		/* skip holes and direct mapped extents */
-		if (bytenr == 0 || bytenr == offset)
-			goto next_extent;
-
-		bytenr += btrfs_file_extent_offset(leaf, fi);
-		num_bytes = btrfs_file_extent_num_bytes(leaf, fi);
-
-		cache1 = btrfs_lookup_block_group(root->fs_info, offset);
-		cache2 = btrfs_lookup_block_group(root->fs_info,
-						  offset + num_bytes - 1);
-		/*
-		 * Here we must take consideration of old and new convert
-		 * behavior.
-		 * For old convert case, sign, there is no consist chunk type
-		 * that will cover the extent. META/DATA/SYS are all possible.
-		 * Just ensure relocate one is in SYS chunk.
-		 * For new convert case, they are all covered by DATA chunk.
-		 *
-		 * So, there is not valid chunk type check for it now.
-		 */
-		if (cache1 != cache2)
-			break;
-
-		set_extent_bits(&io_tree, offset, offset + num_bytes - 1,
-				EXTENT_LOCKED, GFP_NOFS);
-		set_state_private(&io_tree, offset, bytenr);
-next_extent:
-		offset += btrfs_file_extent_num_bytes(leaf, fi);
-		path.slots[0]++;
-	}
-	btrfs_release_path(&path);
-
-	if (offset < total_bytes) {
-		error("unable to build extent mapping (offset %llu, total_bytes %llu)",
-				(unsigned long long)offset,
-				(unsigned long long)total_bytes);
-		error("converted filesystem after balance is unable to rollback");
+	/* Check if we can rollback, and collect relocated data */
+	ret = check_rollback(fs_info, reloc_ranges);
+	close_ctree(root);
+	if (ret < 0)
 		goto fail;
-	}
 
-	first_free = BTRFS_SUPER_INFO_OFFSET + 2 * sectorsize - 1;
-	first_free &= ~((u64)sectorsize - 1);
-	/* backup for extent #0 should exist */
-	if(!test_range_bit(&io_tree, 0, first_free - 1, EXTENT_LOCKED, 1)) {
-		error("no backup for the first extent");
-		goto fail;
-	}
-	/* force no allocation from system block group */
-	root->fs_info->system_allocs = -1;
-	trans = btrfs_start_transaction(root, 1);
-	if (!trans) {
-		error("unable to start transaction");
-		goto fail;
-	}
 	/*
-	 * recow the whole chunk tree, this will remove all chunk tree blocks
-	 * from system block group
+	 * Reloc data are saved in reloc_ranges, just pwrite them
+	 *
+	 * And write from backup roots, so if things goes wrong, we still
+	 * have a mountable btrfs
 	 */
-	chunk_root = root->fs_info->chunk_root;
-	memset(&key, 0, sizeof(key));
-	while (1) {
-		ret = btrfs_search_slot(trans, chunk_root, &key, &path, 0, 1);
-		if (ret < 0)
-			break;
+	for (i = 2; i >= 0; i--) {
+		u64 real_size;
 
-		ret = btrfs_next_leaf(chunk_root, &path);
-		if (ret)
-			break;
-
-		btrfs_item_key_to_cpu(path.nodes[0], &key, path.slots[0]);
-		btrfs_release_path(&path);
-	}
-	btrfs_release_path(&path);
-
-	offset = 0;
-	num_bytes = 0;
-	while(1) {
-		cache1 = btrfs_lookup_block_group(root->fs_info, offset);
-		if (!cache1)
-			break;
-
-		if (cache1->flags & BTRFS_BLOCK_GROUP_SYSTEM)
-			num_bytes += btrfs_block_group_used(&cache1->item);
-
-		offset = cache1->key.objectid + cache1->key.offset;
-	}
-	/* only extent #0 left in system block group? */
-	if (num_bytes > first_free) {
-		error(
-	"unable to empty system block group (num_bytes %llu, first_free %llu",
-				(unsigned long long)num_bytes,
-				(unsigned long long)first_free);
-		goto fail;
-	}
-	/* create a system chunk that maps the whole device */
-	ret = prepare_system_chunk_sb(root->fs_info->super_copy);
-	if (ret) {
-		error("unable to update system chunk: %d", ret);
-		goto fail;
-	}
-
-	ret = btrfs_commit_transaction(trans, root);
-	if (ret) {
-		error("transaction commit failed: %d", ret);
-		goto fail;
-	}
-
-	ret = close_ctree(root);
-	if (ret) {
-		error("close_ctree failed: %d", ret);
-		goto fail;
-	}
-
-	/* zero btrfs super block mirrors */
-	memset(buf, 0, sectorsize);
-	for (i = 1 ; i < BTRFS_SUPER_MIRROR_MAX; i++) {
-		bytenr = btrfs_sb_offset(i);
-		if (bytenr >= total_bytes)
-			break;
-		ret = pwrite(fd, buf, sectorsize, bytenr);
-		if (ret != sectorsize) {
-			error("zeroing superblock mirror %d failed: %d",
-					i, ret);
+		if (reserved_range_starts[i] > fsize)
+			continue;
+		real_size = min(reserved_range_starts[i] +
+				reserved_range_lens[i], (u64)fsize) -
+			    reserved_range_starts[i];
+		ret = pwrite(fd, reloc_ranges[i], real_size,
+			     reserved_range_starts[i]);
+		if (ret != real_size) {
+			if (ret < 0)
+				ret = -errno;
+			else
+				ret = -EIO;
+			error(
+			"failed to recover reserved range [%llu, %llu): %s",
+				reserved_range_starts[i],
+				reserved_range_starts[i] + real_size,
+				strerror(-ret));
 			goto fail;
 		}
+		ret = 0;
 	}
-
-	sb_bytenr = (u64)-1;
-	/* copy all relocated blocks back */
-	while(1) {
-		ret = find_first_extent_bit(&io_tree, 0, &start, &end,
-					    EXTENT_LOCKED);
-		if (ret)
-			break;
-
-		ret = get_state_private(&io_tree, start, &bytenr);
-		BUG_ON(ret);
-
-		clear_extent_bits(&io_tree, start, end, EXTENT_LOCKED,
-				  GFP_NOFS);
-
-		while (start <= end) {
-			if (start == BTRFS_SUPER_INFO_OFFSET) {
-				sb_bytenr = bytenr;
-				goto next_sector;
-			}
-			ret = pread(fd, buf, sectorsize, bytenr);
-			if (ret < 0) {
-				error("reading superblock at %llu failed: %d",
-						(unsigned long long)bytenr, ret);
-				goto fail;
-			}
-			BUG_ON(ret != sectorsize);
-			ret = pwrite(fd, buf, sectorsize, start);
-			if (ret < 0) {
-				error("writing superblock at %llu failed: %d",
-						(unsigned long long)start, ret);
-				goto fail;
-			}
-			BUG_ON(ret != sectorsize);
-next_sector:
-			start += sectorsize;
-			bytenr += sectorsize;
-		}
-	}
-
-	ret = fsync(fd);
-	if (ret < 0) {
-		error("fsync failed: %s", strerror(errno));
-		goto fail;
-	}
-	/*
-	 * finally, overwrite btrfs super block.
-	 */
-	ret = pread(fd, buf, sectorsize, sb_bytenr);
-	if (ret < 0) {
-		error("reading primary superblock failed: %s",
-				strerror(errno));
-		goto fail;
-	}
-	BUG_ON(ret != sectorsize);
-	ret = pwrite(fd, buf, sectorsize, BTRFS_SUPER_INFO_OFFSET);
-	if (ret < 0) {
-		error("writing primary superblock failed: %s",
-				strerror(errno));
-		goto fail;
-	}
-	BUG_ON(ret != sectorsize);
-	ret = fsync(fd);
-	if (ret < 0) {
-		error("fsync failed: %s", strerror(errno));
-		goto fail;
-	}
-
-	close(fd);
-	free(buf);
-	extent_io_tree_cleanup(&io_tree);
 	printf("rollback complete\n");
-	return 0;
-
 fail:
 	if (fd != -1)
 		close(fd);
-	free(buf);
-	error("rollback aborted");
-	return -1;
+free_mem:
+	for (i = 0; i < ARRAY_SIZE(reserved_range_starts); i++)
+		free(reloc_ranges[i]);
+
+	if (ret < 0)
+		error("rollback aborted");
+	return ret;
 }
 
 static void print_usage(void)
-- 
2.10.2




^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v3 6/6] btrfs-progs: convert-test: trigger chunk allocation after convert
  2016-12-19  6:56 [PATCH v3 0/6] Convert rollback rework for v4.9 Qu Wenruo
                   ` (4 preceding siblings ...)
  2016-12-19  6:56 ` [PATCH v3 5/6] btrfs-progs: convert: Switch to new rollback function Qu Wenruo
@ 2016-12-19  6:56 ` Qu Wenruo
  2016-12-21 14:35 ` [PATCH v3 0/6] Convert rollback rework for v4.9 David Sterba
  6 siblings, 0 replies; 18+ messages in thread
From: Qu Wenruo @ 2016-12-19  6:56 UTC (permalink / raw)
  To: linux-btrfs

Populate fs after convert so we can trigger data chunk allocation.
This can expose too restrict old rollback condition

Reported-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
---
 tests/common         | 4 ++++
 tests/common.convert | 3 +++
 2 files changed, 7 insertions(+)

diff --git a/tests/common b/tests/common
index 571118a..51c2e26 100644
--- a/tests/common
+++ b/tests/common
@@ -486,6 +486,10 @@ generate_dataset() {
 				run_check $SUDO_HELPER ln -s "$dirpath/$long_filename" "$dirpath/slow_slink.$num"
 			done
 			;;
+		large)
+			run_check $SUDO_HELPER dd if=/dev/urandom bs=32M count=1 \
+				of="$dirpath/$dataset_type" >/dev/null 2>&1
+			;;
 	esac
 }
 
diff --git a/tests/common.convert b/tests/common.convert
index a2d3152..8c9242e 100644
--- a/tests/common.convert
+++ b/tests/common.convert
@@ -160,6 +160,9 @@ convert_test_post_checks_all() {
 	convert_test_post_check_checksums "$1"
 	convert_test_post_check_permissions "$2"
 	convert_test_post_check_acl "$3"
+
+	# Create a large file to trigger data chunk allocation
+	generate_dataset "large"
 	run_check_umount_test_dev
 }
 
-- 
2.10.2




^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 5/6] btrfs-progs: convert: Switch to new rollback function
  2016-12-19  6:56 ` [PATCH v3 5/6] btrfs-progs: convert: Switch to new rollback function Qu Wenruo
@ 2016-12-20  5:36   ` Chandan Rajendra
  2017-01-23 17:54   ` David Sterba
  1 sibling, 0 replies; 18+ messages in thread
From: Chandan Rajendra @ 2016-12-20  5:36 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

On Monday, December 19, 2016 02:56:41 PM Qu Wenruo wrote:
> Since we have the whole facilities needed to rollback, switch to the new
> rollback.
> 
> The new rollback function can handle the following things that old
> rollback either can't handle or just refuse to rollback:
> 
> 1) New converted btrfs which allocates new data chunk
>    This is due to the too restrict may_roll_back() condition, which is
>    never a good friend for new convert behavior.
> 
>    The new rollback behavior fixes it by not checking data chunks, but
>    only to check the file extents of the convert image file.
> 
>    If all file extents except the ones in reserved ranges, then we allow
>    rollback.
> 
> 2) New converted btrfs which enabled NO_HOLES feature
>    Thanks to previous patches, we can convert to real NO_HOLES btrfs.
> 
>    And since old rollback assumes that file extents and holes covers the
>    whole image file, it will fail due to the non-exists holes.
> 
>    Fix it by iterating file extents of convert image, and only compare
>    the size we checked against file size if NO_HOLES is not enabled.
> 
> And makes rollback function simpler:
> 
> 1) Read-n-write vs extra chunk tree relocation
>    Since converted btrfs only has 3 ranges that are not 1:1 mapped, just
>    read this data out, and close btrfs, finally write them into position
>    will be good enough, for both new convert and old convert.
>    (To be more specific, old convert is just a subset of more universal
>     new convert behavior)
> 
>    No extra work is needed any more, and we can even open the btrfs RO.

Thanks for fixing this. The patchset works fine on ppc64 and x86_64.

Tested-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>

-- 
chandan


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 0/6] Convert rollback rework for v4.9
  2016-12-19  6:56 [PATCH v3 0/6] Convert rollback rework for v4.9 Qu Wenruo
                   ` (5 preceding siblings ...)
  2016-12-19  6:56 ` [PATCH v3 6/6] btrfs-progs: convert-test: trigger chunk allocation after convert Qu Wenruo
@ 2016-12-21 14:35 ` David Sterba
  2016-12-22  1:53   ` Qu Wenruo
  6 siblings, 1 reply; 18+ messages in thread
From: David Sterba @ 2016-12-21 14:35 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

On Mon, Dec 19, 2016 at 02:56:36PM +0800, Qu Wenruo wrote:
> Can be fetched from github:
> https://github.com/adam900710/btrfs-progs.git convert_rework_for_4.9
> 
> This is mainly to fix problems exposed by Chandan's fix for 64K nodesize.

Thanks for the fix. That's a lot of new code that I don't feel
comfortable to put to the upcomming release, so I'll merge 1 and 6 now
and queue the rest for 4.9.1.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 0/6] Convert rollback rework for v4.9
  2016-12-21 14:35 ` [PATCH v3 0/6] Convert rollback rework for v4.9 David Sterba
@ 2016-12-22  1:53   ` Qu Wenruo
  0 siblings, 0 replies; 18+ messages in thread
From: Qu Wenruo @ 2016-12-22  1:53 UTC (permalink / raw)
  To: dsterba, linux-btrfs



At 12/21/2016 10:35 PM, David Sterba wrote:
> On Mon, Dec 19, 2016 at 02:56:36PM +0800, Qu Wenruo wrote:
>> Can be fetched from github:
>> https://github.com/adam900710/btrfs-progs.git convert_rework_for_4.9
>>
>> This is mainly to fix problems exposed by Chandan's fix for 64K nodesize.
>
> Thanks for the fix. That's a lot of new code that I don't feel
> comfortable to put to the upcomming release, so I'll merge 1 and 6 now
> and queue the rest for 4.9.1.

Feel free to delay them to 4.9.1.

But since we have regression, I think we'd better delay the following 
patches along with the rollback rework:

btrfs-progs: convert: Fix migrate_super_block() to work with 64k sectorsize

btrfs-progs: convert: Prevent accounting blocks beyond end of device

And the 6th patch of the patchset.


The 2 convert fixes are completely fine, but they slightly change the 
chunk layout, and makes 4K sectorsize convert test fails.
So is the 6th patch, which enhanced the testcases, to make them always 
fail if rollback rework is not applied.

So IMHO it's better to ensure the v4.9 release can pass self tests.
(Even we know it has bugs)

And fixes all the known convert bugs in v4.9.1.

How do you think about this idea?

Thanks,
Qu

> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
>



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 2/6] btrfs-progs: utils: Introduce basic set operations for range
  2016-12-19  6:56 ` [PATCH v3 2/6] btrfs-progs: utils: Introduce basic set operations for range Qu Wenruo
@ 2017-01-23 17:28   ` David Sterba
  2017-01-24  0:40     ` Qu Wenruo
  2017-01-23 17:40   ` David Sterba
  1 sibling, 1 reply; 18+ messages in thread
From: David Sterba @ 2017-01-23 17:28 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

On Mon, Dec 19, 2016 at 02:56:38PM +0800, Qu Wenruo wrote:
> Introduce basic set operations: is_subset() and is_intersection().
> 
> This is quite useful to check if a range [start, start + len) subset or
> intersection of another range.
> So we don't need to use open code to do it, which I sometimes do it
> wrong.
> 
> Also use these new facilities in btrfs-convert, to check if a range is a
> subset or intersects with btrfs convert reserved ranges.

I see the range helpers used only inside convert so I don't think we
need to export them into utils. Then you could introduce a helper
structure with start and len members and use that instead of 2 arrays

> --- a/disk-io.h
> +++ b/disk-io.h
> @@ -97,11 +97,16 @@ enum btrfs_read_sb_flags {
>  	SBREAD_PARTIAL		= (1 << 1),
>  };
>  
> +/*
> + * Use macro to define mirror super block position
> + * So we can use it in static array initializtion
> + */
> +#define BTRFS_SB_MIRROR_OFFSET(mirror)	((u64)(16 * 1024) << \
> +					 (BTRFS_SUPER_MIRROR_SHIFT * (mirror)))

This is unrelated change and should go separately.

>  static inline u64 btrfs_sb_offset(int mirror)
>  {
> -	u64 start = 16 * 1024;
>  	if (mirror)
> -		return start << (BTRFS_SUPER_MIRROR_SHIFT * mirror);
> +		return BTRFS_SB_MIRROR_OFFSET(mirror);
>  	return BTRFS_SUPER_INFO_OFFSET;
>  }
>  

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 2/6] btrfs-progs: utils: Introduce basic set operations for range
  2016-12-19  6:56 ` [PATCH v3 2/6] btrfs-progs: utils: Introduce basic set operations for range Qu Wenruo
  2017-01-23 17:28   ` David Sterba
@ 2017-01-23 17:40   ` David Sterba
  1 sibling, 0 replies; 18+ messages in thread
From: David Sterba @ 2017-01-23 17:40 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

On Mon, Dec 19, 2016 at 02:56:38PM +0800, Qu Wenruo wrote:
> +static u64 reserved_range_starts[3] = { 0, BTRFS_SB_MIRROR_OFFSET(1),
> +					BTRFS_SB_MIRROR_OFFSET(2) };
> +static u64 reserved_range_lens[3] = { 1024 * 1024, 64 * 1024, 64 * 1024 };

Also anywhere in the relevant code, the 3 should be better a named
constant and not either 3 or the ARRAY_SIZE, or 2 in some backward going
for loop I've seen.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 5/6] btrfs-progs: convert: Switch to new rollback function
  2016-12-19  6:56 ` [PATCH v3 5/6] btrfs-progs: convert: Switch to new rollback function Qu Wenruo
  2016-12-20  5:36   ` Chandan Rajendra
@ 2017-01-23 17:54   ` David Sterba
  2017-01-24  0:44     ` Qu Wenruo
  1 sibling, 1 reply; 18+ messages in thread
From: David Sterba @ 2017-01-23 17:54 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

On Mon, Dec 19, 2016 at 02:56:41PM +0800, Qu Wenruo wrote:
> Since we have the whole facilities needed to rollback, switch to the new
> rollback.

Sorry, the change from patch 4 to patch 5 seems too big to grasp for me,
reviewing is really hard and I'm not sure I could even do that. My
concern is namely about patch 5/6 that throws out a lot of code that
does not obviously map to the new code.

I can try again to see if there are points where the patch could be
split, but at the moment the patchset is too scary to merge.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 2/6] btrfs-progs: utils: Introduce basic set operations for range
  2017-01-23 17:28   ` David Sterba
@ 2017-01-24  0:40     ` Qu Wenruo
  0 siblings, 0 replies; 18+ messages in thread
From: Qu Wenruo @ 2017-01-24  0:40 UTC (permalink / raw)
  To: dsterba, linux-btrfs



At 01/24/2017 01:28 AM, David Sterba wrote:
> On Mon, Dec 19, 2016 at 02:56:38PM +0800, Qu Wenruo wrote:
>> Introduce basic set operations: is_subset() and is_intersection().
>>
>> This is quite useful to check if a range [start, start + len) subset or
>> intersection of another range.
>> So we don't need to use open code to do it, which I sometimes do it
>> wrong.
>>
>> Also use these new facilities in btrfs-convert, to check if a range is a
>> subset or intersects with btrfs convert reserved ranges.
>
> I see the range helpers used only inside convert so I don't think we
> need to export them into utils. Then you could introduce a helper
> structure with start and len members and use that instead of 2 arrays

Right, I could move it convert.

>
>> --- a/disk-io.h
>> +++ b/disk-io.h
>> @@ -97,11 +97,16 @@ enum btrfs_read_sb_flags {
>>  	SBREAD_PARTIAL		= (1 << 1),
>>  };
>>
>> +/*
>> + * Use macro to define mirror super block position
>> + * So we can use it in static array initializtion
>> + */
>> +#define BTRFS_SB_MIRROR_OFFSET(mirror)	((u64)(16 * 1024) << \
>> +					 (BTRFS_SUPER_MIRROR_SHIFT * (mirror)))
>
> This is unrelated change and should go separately.

OK, I can send out a patch first.

Thanks for reviewing,
Qu

>
>>  static inline u64 btrfs_sb_offset(int mirror)
>>  {
>> -	u64 start = 16 * 1024;
>>  	if (mirror)
>> -		return start << (BTRFS_SUPER_MIRROR_SHIFT * mirror);
>> +		return BTRFS_SB_MIRROR_OFFSET(mirror);
>>  	return BTRFS_SUPER_INFO_OFFSET;
>>  }
>>
>
>



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 5/6] btrfs-progs: convert: Switch to new rollback function
  2017-01-23 17:54   ` David Sterba
@ 2017-01-24  0:44     ` Qu Wenruo
  2017-01-24 16:37       ` David Sterba
  0 siblings, 1 reply; 18+ messages in thread
From: Qu Wenruo @ 2017-01-24  0:44 UTC (permalink / raw)
  To: dsterba, linux-btrfs



At 01/24/2017 01:54 AM, David Sterba wrote:
> On Mon, Dec 19, 2016 at 02:56:41PM +0800, Qu Wenruo wrote:
>> Since we have the whole facilities needed to rollback, switch to the new
>> rollback.
>
> Sorry, the change from patch 4 to patch 5 seems too big to grasp for me,
> reviewing is really hard and I'm not sure I could even do that. My
> concern is namely about patch 5/6 that throws out a lot of code that
> does not obviously map to the new code.
>
> I can try again to see if there are points where the patch could be
> split, but at the moment the patchset is too scary to merge.
>

So this implies the current implementation is not good enough for review.

I'll try to extract more more set operation and make the core part more 
refined, with more ascii art comment for it.

Thanks,
Qu



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 5/6] btrfs-progs: convert: Switch to new rollback function
  2017-01-24  0:44     ` Qu Wenruo
@ 2017-01-24 16:37       ` David Sterba
  2017-01-25  0:42         ` Qu Wenruo
  0 siblings, 1 reply; 18+ messages in thread
From: David Sterba @ 2017-01-24 16:37 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: dsterba, linux-btrfs

On Tue, Jan 24, 2017 at 08:44:00AM +0800, Qu Wenruo wrote:
> 
> 
> At 01/24/2017 01:54 AM, David Sterba wrote:
> > On Mon, Dec 19, 2016 at 02:56:41PM +0800, Qu Wenruo wrote:
> >> Since we have the whole facilities needed to rollback, switch to the new
> >> rollback.
> >
> > Sorry, the change from patch 4 to patch 5 seems too big to grasp for me,
> > reviewing is really hard and I'm not sure I could even do that. My
> > concern is namely about patch 5/6 that throws out a lot of code that
> > does not obviously map to the new code.
> >
> > I can try again to see if there are points where the patch could be
> > split, but at the moment the patchset is too scary to merge.
> >
> 
> So this implies the current implementation is not good enough for review.

I'd say the code hasn't been cleaned up for a long time so it's not good
enough for adding new features and doing broader fixes. The v2 rework
has fixed quite an important issue, but for other issues I'd rather get
smaller patches that eg. prepare the code for the final change.
Something that I can review without needing to reread the whole convert
and refresh memories about all details.

> I'll try to extract more more set operation and make the core part more 
> refined, with more ascii art comment for it.

The ascii diagrams help, the overall convert design could be also better
documented etc. At the moment I'd rather spend some time on cleaning up
the sources but also don't want to block the fixes you've been sending.
I need to think about that more.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 5/6] btrfs-progs: convert: Switch to new rollback function
  2017-01-24 16:37       ` David Sterba
@ 2017-01-25  0:42         ` Qu Wenruo
  2017-01-30 14:55           ` David Sterba
  0 siblings, 1 reply; 18+ messages in thread
From: Qu Wenruo @ 2017-01-25  0:42 UTC (permalink / raw)
  To: dsterba, linux-btrfs



At 01/25/2017 12:37 AM, David Sterba wrote:
> On Tue, Jan 24, 2017 at 08:44:00AM +0800, Qu Wenruo wrote:
>>
>>
>> At 01/24/2017 01:54 AM, David Sterba wrote:
>>> On Mon, Dec 19, 2016 at 02:56:41PM +0800, Qu Wenruo wrote:
>>>> Since we have the whole facilities needed to rollback, switch to the new
>>>> rollback.
>>>
>>> Sorry, the change from patch 4 to patch 5 seems too big to grasp for me,
>>> reviewing is really hard and I'm not sure I could even do that. My
>>> concern is namely about patch 5/6 that throws out a lot of code that
>>> does not obviously map to the new code.
>>>
>>> I can try again to see if there are points where the patch could be
>>> split, but at the moment the patchset is too scary to merge.
>>>
>>
>> So this implies the current implementation is not good enough for review.
>
> I'd say the code hasn't been cleaned up for a long time so it's not good
> enough for adding new features and doing broader fixes. The v2 rework
> has fixed quite an important issue, but for other issues I'd rather get
> smaller patches that eg. prepare the code for the final change.
> Something that I can review without needing to reread the whole convert
> and refresh memories about all details.
>
>> I'll try to extract more more set operation and make the core part more
>> refined, with more ascii art comment for it.
>
> The ascii diagrams help, the overall convert design could be also better
> documented etc. At the moment I'd rather spend some time on cleaning up
> the sources but also don't want to block the fixes you've been sending.
> I need to think about that more.

Feel free to block the rework.

I'll start from sending out basic documentations explaining the logic 
behind convert/rollback, which should help review.

Thanks,
Qu



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 5/6] btrfs-progs: convert: Switch to new rollback function
  2017-01-25  0:42         ` Qu Wenruo
@ 2017-01-30 14:55           ` David Sterba
  0 siblings, 0 replies; 18+ messages in thread
From: David Sterba @ 2017-01-30 14:55 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: dsterba, linux-btrfs

On Wed, Jan 25, 2017 at 08:42:01AM +0800, Qu Wenruo wrote:
> >> So this implies the current implementation is not good enough for review.
> >
> > I'd say the code hasn't been cleaned up for a long time so it's not good
> > enough for adding new features and doing broader fixes. The v2 rework
> > has fixed quite an important issue, but for other issues I'd rather get
> > smaller patches that eg. prepare the code for the final change.
> > Something that I can review without needing to reread the whole convert
> > and refresh memories about all details.
> >
> >> I'll try to extract more more set operation and make the core part more
> >> refined, with more ascii art comment for it.
> >
> > The ascii diagrams help, the overall convert design could be also better
> > documented etc. At the moment I'd rather spend some time on cleaning up
> > the sources but also don't want to block the fixes you've been sending.
> > I need to think about that more.
> 
> Feel free to block the rework.
> 
> I'll start from sending out basic documentations explaining the logic 
> behind convert/rollback, which should help review.

FYI, I've reorganized the convert files a bit, this patchset does not
apply anymore, but I'm expecting some more changes to it so please adapt
it to the new file structure.

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2017-01-30 15:14 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-12-19  6:56 [PATCH v3 0/6] Convert rollback rework for v4.9 Qu Wenruo
2016-12-19  6:56 ` [PATCH v3 1/6] btrfs-progs: file-item: Fix wrong file extents inserted Qu Wenruo
2016-12-19  6:56 ` [PATCH v3 2/6] btrfs-progs: utils: Introduce basic set operations for range Qu Wenruo
2017-01-23 17:28   ` David Sterba
2017-01-24  0:40     ` Qu Wenruo
2017-01-23 17:40   ` David Sterba
2016-12-19  6:56 ` [PATCH v3 3/6] btrfs-progs: convert: Introduce function to record relocated ranges Qu Wenruo
2016-12-19  6:56 ` [PATCH v3 4/6] btrfs-progs: convert: Introduce new function to check if we can rollback Qu Wenruo
2016-12-19  6:56 ` [PATCH v3 5/6] btrfs-progs: convert: Switch to new rollback function Qu Wenruo
2016-12-20  5:36   ` Chandan Rajendra
2017-01-23 17:54   ` David Sterba
2017-01-24  0:44     ` Qu Wenruo
2017-01-24 16:37       ` David Sterba
2017-01-25  0:42         ` Qu Wenruo
2017-01-30 14:55           ` David Sterba
2016-12-19  6:56 ` [PATCH v3 6/6] btrfs-progs: convert-test: trigger chunk allocation after convert Qu Wenruo
2016-12-21 14:35 ` [PATCH v3 0/6] Convert rollback rework for v4.9 David Sterba
2016-12-22  1:53   ` Qu Wenruo

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.