linux-btrfs.vger.kernel.org archive mirror
* [PATCH 0/4] btrfs: refactor scrub entrances for each profile
@ 2022-01-07  2:34 Qu Wenruo
  2022-01-07  2:34 ` [PATCH 1/4] btrfs: introduce a helper to locate an extent item Qu Wenruo
                   ` (5 more replies)
  0 siblings, 6 replies; 10+ messages in thread
From: Qu Wenruo @ 2022-01-07  2:34 UTC (permalink / raw)
  To: linux-btrfs

The branch is based on several recently submitted small cleanups, thus
it's better to fetch the branch from GitHub:

https://github.com/adam900710/linux/tree/refactor_scrub

[CRAP-BUT-IT-WORKS(TM)]

Scrub is one of the areas we seldom touch, because:

- It's a mess
  Just check the scrub_stripe() function.
  It's a single function scrubbing a stripe for *all* profiles.

  It's nearly 400 lines for a single complex function, with a double
  while() loop and several different jumps inside the loop.

  Not to mention the lack of comments for various structures.

  This should never, and will never again, happen under our current code
  standards.

- It just works
  I have hit more than 10 bugs during development, and I almost wanted to
  give up the refactor: even though the existing code is crap, it works,
  passing the existing scrub/replace test group.
  Meanwhile, no matter how small a code change I make, it always fails to
  pass the same tests.

[REFACTOR-IDEA]

The core idea here is to get rid of the one-size-fits-all solution in
scrub_stripe().

Instead, we explicitly separate the scrub into 3 groups (the idea is
from my btrfs-fuse project):

- Simple-mirror based profiles
  This includes SINGLE/DUP/RAID1/RAID1C* profiles.
  They have no stripe, and their repair is purely mirror based.

- Simple-stripe based profiles
  This includes RAID0/RAID10 profiles.
  They are just simple stripes (without P/Q or rotation), with extra
  mirrors to support their repair.

- RAID56
  The most complex profiles: they have extra P/Q stripes, and have rotation.

[REFACTOR-IMPLEMENTATION]

So we have 3 entrances for all the supported profiles (a simplified dispatch
sketch follows the list):

- scrub_simple_mirror()
  For SINGLE/DUP/RAID1/RAID1C* profiles.
  Just go through each extent and scrub the extent.

- scrub_simple_stripe()
  For RAID0/RAID10 profiles.
  Instead we go through each data stripe first, then inside each data
  stripe we call scrub_simple_mirror(), since after the stripe split,
  RAID0 is just SINGLE and RAID10 is just RAID1.

- scrub_stripe() untouched for RAID56
  RAID56 still has all the complex things to do, but they can still be
  split into two types (as the original code already does):

  * data stripes
    They are no different in the verification part; RAID56 is just
    SINGLE if we ignore the repair path.
    It's only in the repair path that the two diverge.

    So we can reuse scrub_simple_mirror() again.

  * P/Q stripes
    They already have a dedicated function handling the case.
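
As referenced above, the dispatch at the top of scrub_stripe() ends up looking
roughly like this (a simplified sketch assembled from the patches below; error
handling and the remaining RAID56 loop are omitted):

	const u64 profile = map->type & BTRFS_BLOCK_GROUP_PROFILE_MASK;

	if (!(profile & (BTRFS_BLOCK_GROUP_RAID0 | BTRFS_BLOCK_GROUP_RAID10 |
			 BTRFS_BLOCK_GROUP_RAID56_MASK))) {
		/* SINGLE/DUP/RAID1/RAID1C*: pure mirror based scrub */
		ret = scrub_simple_mirror(sctx, root, csum_root, bg, map,
					  bg->start, bg->length, scrub_dev,
					  map->stripes[stripe_index].physical,
					  stripe_index + 1);
		goto out;
	}
	if (profile & (BTRFS_BLOCK_GROUP_RAID0 | BTRFS_BLOCK_GROUP_RAID10)) {
		/* RAID0/RAID10: iterate the data stripes of this device */
		ret = scrub_simple_stripe(sctx, root, csum_root, bg, map,
					  scrub_dev, stripe_index);
		goto out;
	}
	/* Only RAID56 goes through the old per-stripe loop */
	ASSERT(map->type & BTRFS_BLOCK_GROUP_RAID56_MASK);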

With all these refactors, although we have several more functions, we
get rid of:

- A double while () loop
- Several jumps inside the double loop
- Complex calculation to try to fit all profiles

And we get:

- Better comments
- More dedicated functions
- A better basis for further refactors

[INCOMING CLEANUPS]

- Use find_first_extent_item() to clean up the RAID56 code
  This part itself can be as large as this patchset already, thus it will
  be in its own patchset.

- Refactor scrub_pages/scrub_parity/... structures

Qu Wenruo (4):
  btrfs: introduce a helper to locate an extent item
  btrfs: introduce dedicated helper to scrub simple-mirror based range
  btrfs: introduce dedicated helper to scrub simple-stripe based range
  btrfs: use scrub_simple_mirror() to handle RAID56 data stripe scrub

 fs/btrfs/scrub.c | 702 ++++++++++++++++++++++++++++-------------------
 1 file changed, 422 insertions(+), 280 deletions(-)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/4] btrfs: introduce a helper to locate an extent item
  2022-01-07  2:34 [PATCH 0/4] btrfs: refactor scrub entrances for each profile Qu Wenruo
@ 2022-01-07  2:34 ` Qu Wenruo
  2022-01-07  2:34 ` [PATCH 2/4] btrfs: introduce dedicated helper to scrub simple-mirror based range Qu Wenruo
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Qu Wenruo @ 2022-01-07  2:34 UTC (permalink / raw)
  To: linux-btrfs

The new helper, find_first_extent_item(), will locate an extent item
(either EXTENT_ITEM or METADATA_ITEM) which covers any byte of the
search range.

This helper will later be used to refactor scrub code.
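
A typical caller pattern would look like the following (an illustrative
sketch only; the real callers are introduced by the following patches, and
@cur/@ret here are just placeholder names):

	struct btrfs_fs_info *fs_info = extent_root->fs_info;
	struct btrfs_path path = {};
	struct btrfs_key key;
	u64 cur = search_start;
	int ret;

	path.search_commit_root = 1;
	path.skip_locking = 1;
	while (cur < search_start + search_len) {
		ret = find_first_extent_item(extent_root, &path, cur,
					     search_start + search_len - cur);
		if (ret)	/* <0 is a fatal error, >0 means no more items */
			break;
		btrfs_item_key_to_cpu(path.nodes[0], &key, path.slots[0]);
		/* ... scrub the extent item at @key here ... */

		/* Advance past this extent (metadata items use nodesize) */
		cur = key.objectid + (key.type == BTRFS_METADATA_ITEM_KEY ?
				      fs_info->nodesize : key.offset);
	}
	btrfs_release_path(&path);

The path is kept across iterations, so each call can continue from the
current slot instead of doing a fresh btrfs_search_slot().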

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/scrub.c | 108 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 108 insertions(+)

diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index 2e9a322773f2..ee179eb2ea17 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -2882,6 +2882,114 @@ static void scrub_parity_put(struct scrub_parity *sparity)
 	scrub_parity_check_and_repair(sparity);
 }
 
+/*
+ * Return 0 if the extent item range covers any byte of the range.
+ * Return <0 if the extent item is before @search_start.
+ * Return >0 if the extent item is after @search_start + @search_len.
+ */
+static int compare_extent_item_range(struct btrfs_path *path,
+				     u64 search_start, u64 search_len)
+{
+	struct btrfs_fs_info *fs_info = path->nodes[0]->fs_info;
+	u64 len;
+	struct btrfs_key key;
+
+	btrfs_item_key_to_cpu(path->nodes[0], &key, path->slots[0]);
+	ASSERT(key.type == BTRFS_EXTENT_ITEM_KEY ||
+	       key.type == BTRFS_METADATA_ITEM_KEY);
+	if (key.type == BTRFS_METADATA_ITEM_KEY)
+		len = fs_info->nodesize;
+	else
+		len = key.offset;
+
+	if (key.objectid + len <= search_start)
+		return -1;
+	if (key.objectid >= search_start + search_len)
+		return 1;
+	return 0;
+}
+
+/*
+ * Helper to locate one extent item which covers any byte in range
+ * [@search_start, @search_start + @search_len)
+ *
+ * If the path is not initialized, we will initialize the search by doing
+ * a btrfs_search_slot().
+ * If the path is already initialized, we will use the path as the initial
+ * slot, to avoid duplicated btrfs_search_slot() calls.
+ *
+ * NOTE: If an extent item starts before @search_start, we will still
+ * return the extent item. This is for data extents crossing stripe boundaries.
+ *
+ * Return 0 if we found such extent item, and @path will point to the
+ * extent item.
+ * Return >0 if no such extent item can be found, and @path will be released.
+ * Return <0 if we hit a fatal error, and @path will be released.
+ */
+static int find_first_extent_item(struct btrfs_root *extent_root,
+				  struct btrfs_path *path,
+				  u64 search_start, u64 search_len)
+{
+	struct btrfs_fs_info *fs_info = extent_root->fs_info;
+	struct btrfs_key key;
+	int ret;
+
+	/* Continue using the existing path */
+	if (path->nodes[0])
+		goto search_forward;
+
+	if (btrfs_fs_incompat(fs_info, SKINNY_METADATA))
+		key.type = BTRFS_METADATA_ITEM_KEY;
+	else
+		key.type = BTRFS_EXTENT_ITEM_KEY;
+	key.objectid = search_start;
+	key.offset = (u64)-1;
+
+	ret = btrfs_search_slot(NULL, extent_root, &key, path, 0, 0);
+	if (ret < 0)
+		return ret;
+
+	ASSERT(ret > 0);
+	/*
+	 * Here we intentionally pass 0 as @min_objectid, as there could be
+	 * an extent item starting before @search_start.
+	 */
+	ret = btrfs_previous_extent_item(extent_root, path, 0);
+	if (ret < 0)
+		return ret;
+	/*
+	 * No matter whether we have found an extent item, the next loop will
+	 * properly do every check on the key.
+	 */
+search_forward:
+	while (true) {
+		btrfs_item_key_to_cpu(path->nodes[0], &key, path->slots[0]);
+		if (key.objectid >= search_start + search_len)
+			break;
+		if (key.type != BTRFS_METADATA_ITEM_KEY &&
+		    key.type != BTRFS_EXTENT_ITEM_KEY)
+			goto next;
+
+		ret = compare_extent_item_range(path, search_start, search_len);
+		if (ret == 0)
+			return ret;
+		if (ret > 0)
+			break;
+next:
+		path->slots[0]++;
+		if (path->slots[0] >= btrfs_header_nritems(path->nodes[0])) {
+			ret = btrfs_next_leaf(extent_root, path);
+			if (ret) {
+				/* Either no more item or fatal error */
+				btrfs_release_path(path);
+				return ret;
+			}
+		}
+	}
+	btrfs_release_path(path);
+	return 1;
+}
+
 static noinline_for_stack int scrub_raid56_parity(struct scrub_ctx *sctx,
 						  struct map_lookup *map,
 						  struct btrfs_device *sdev,
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/4] btrfs: introduce dedicated helper to scrub simple-mirror based range
  2022-01-07  2:34 [PATCH 0/4] btrfs: refactor scrub entrances for each profile Qu Wenruo
  2022-01-07  2:34 ` [PATCH 1/4] btrfs: introduce a helper to locate an extent item Qu Wenruo
@ 2022-01-07  2:34 ` Qu Wenruo
  2022-01-07  2:34 ` [PATCH 3/4] btrfs: introduce dedicated helper to scrub simple-stripe " Qu Wenruo
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Qu Wenruo @ 2022-01-07  2:34 UTC (permalink / raw)
  To: linux-btrfs

The new helper, scrub_simple_mirror(), will scrub all extents inside a range
which only has simple mirror based duplication.

This covers every range of SINGLE/DUP/RAID1/RAID1C* profiles, and the range
inside each data stripe for RAID0/RAID10.

Currently we will use this function to scrub SINGLE/DUP/RAID1/RAID1C*
profiles.
As one can see, the new entrance for those simple-mirror based profiles
can be kept pretty small (with comments, it just reaches 100 lines).

This function will be the basis for the incoming scrub refactor.
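
As a worked example (purely illustrative numbers): a 1MiB data extent sitting
right at @logical_start gets scrubbed in 16 steps, because each step is capped
by the artificial 64K limit inherited from the old scrub code.  If the same
extent crosses @logical_end (which happens per data stripe once RAID0/RAID10
reuse this helper), the last step is additionally clamped to @logical_end, so
a per-stripe call never scrubs past its own stripe.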

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/scrub.c | 190 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 190 insertions(+)

diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index ee179eb2ea17..7b39705502b4 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -2990,6 +2990,26 @@ static int find_first_extent_item(struct btrfs_root *extent_root,
 	return 1;
 }
 
+static void get_extent_info(struct btrfs_path *path, u64 *extent_start_ret,
+			    u64 *size_ret, u64 *flags_ret, u64 *generation_ret)
+{
+	struct btrfs_key key;
+	struct btrfs_extent_item *ei;
+
+	btrfs_item_key_to_cpu(path->nodes[0], &key, path->slots[0]);
+	ASSERT(key.type == BTRFS_METADATA_ITEM_KEY ||
+	       key.type == BTRFS_EXTENT_ITEM_KEY);
+	*extent_start_ret = key.objectid;
+	if (key.type == BTRFS_METADATA_ITEM_KEY)
+		*size_ret = path->nodes[0]->fs_info->nodesize;
+	else
+		*size_ret = key.offset;
+	ei = btrfs_item_ptr(path->nodes[0], path->slots[0],
+			    struct btrfs_extent_item);
+	*flags_ret = btrfs_extent_flags(path->nodes[0], ei);
+	*generation_ret = btrfs_extent_generation(path->nodes[0], ei);
+}
+
 static noinline_for_stack int scrub_raid56_parity(struct scrub_ctx *sctx,
 						  struct map_lookup *map,
 						  struct btrfs_device *sdev,
@@ -3273,6 +3293,152 @@ static int sync_write_pointer_for_zoned(struct scrub_ctx *sctx, u64 logical,
 	return ret;
 }
 
+static bool does_range_cross_boundary(u64 extent_start, u64 extent_len,
+				      u64 boundary_start, u64 boudary_len)
+{
+	return (extent_start < boundary_start &&
+		extent_start + extent_len > boundary_start) ||
+	       (extent_start < boundary_start + boudary_len &&
+		extent_start + extent_len > boundary_start + boudary_len);
+}
+
+/*
+ * Scrub one range which only has simple mirror based profiles.
+ * (Including all ranges in SINGLE/DUP/RAID1/RAID1C*, and each stripe in
+ *  RAID0/RAID10).
+ *
+ * Since we may need to handle a subset of a block group, we need the
+ * @logical_start and @logical_length parameters.
+ */
+static int scrub_simple_mirror(struct scrub_ctx *sctx,
+				struct btrfs_root *extent_root,
+				struct btrfs_root *csum_root,
+				struct btrfs_block_group *bg,
+				struct map_lookup *map,
+				u64 logical_start, u64 logical_length,
+				struct btrfs_device *device,
+				u64 physical, int mirror_num)
+{
+	struct btrfs_fs_info *fs_info = sctx->fs_info;
+	const u64 logical_end = logical_start + logical_length;
+	/* An artificial limit, inherited from the old scrub behavior */
+	const u32 max_length = SZ_64K;
+	struct btrfs_path path = {};
+	u64 cur_logical = logical_start;
+	int ret;
+
+	/* The range must be inside the bg */
+	ASSERT(logical_start >= bg->start &&
+	       logical_end <= bg->start + bg->length);
+
+	path.search_commit_root = 1;
+	path.skip_locking = 1;
+	/* Go through each extent in the logical range */
+	while (cur_logical < logical_end) {
+		int cur_mirror = mirror_num;
+		struct btrfs_device *target_dev = device;
+		u64 extent_start;
+		u64 extent_len;
+		u64 extent_flags;
+		u64 extent_gen;
+		u64 scrub_len;
+		u64 cur_physical;
+
+		/* Canceled ? */
+		if (atomic_read(&fs_info->scrub_cancel_req) ||
+		    atomic_read(&sctx->cancel_req)) {
+			ret = -ECANCELED;
+			break;
+		}
+		/* Paused ? */
+		if (atomic_read(&fs_info->scrub_pause_req)) {
+			/* Push queued extents */
+			sctx->flush_all_writes = true;
+			scrub_submit(sctx);
+			mutex_lock(&sctx->wr_lock);
+			scrub_wr_submit(sctx);
+			mutex_unlock(&sctx->wr_lock);
+			wait_event(sctx->list_wait,
+				   atomic_read(&sctx->bios_in_flight) == 0);
+			sctx->flush_all_writes = false;
+			scrub_blocked_if_needed(fs_info);
+		}
+		/* Block group removed? */
+		spin_lock(&bg->lock);
+		if (bg->removed) {
+			spin_unlock(&bg->lock);
+			ret = 0;
+			break;
+		}
+		spin_unlock(&bg->lock);
+
+		ret = find_first_extent_item(extent_root, &path, cur_logical,
+					     logical_end - cur_logical);
+		if (ret > 0) {
+			/* No more extent, just update the accounting */
+			sctx->stat.last_physical = physical + logical_length;
+			ret = 0;
+			break;
+		}
+		if (ret < 0)
+			break;
+		get_extent_info(&path, &extent_start, &extent_len,
+				&extent_flags, &extent_gen);
+		/* Skip hole range which doesn't have any extent */
+		cur_logical = max(extent_start, cur_logical);
+
+		/*
+		 * Scrub len has three limits:
+		 * - Extent size limit
+		 * - Scrub range limit
+		 *   This is especially important for RAID0/RAID10 to reuse
+		 *   this function
+		 * - Max scrub size limit
+		 */
+		scrub_len = min(min(extent_start + extent_len,
+				    logical_end), cur_logical + max_length) -
+			    cur_logical;
+		cur_physical = cur_logical - logical_start + physical;
+
+		if (sctx->is_dev_replace)
+			scrub_remap_extent(fs_info, cur_logical, scrub_len,
+					   &cur_physical, &target_dev, &cur_mirror);
+		if (extent_flags & BTRFS_EXTENT_FLAG_DATA) {
+			ret = btrfs_lookup_csums_range(csum_root, cur_logical,
+					cur_logical + scrub_len - 1,
+					&sctx->csum_list, 1);
+			if (ret)
+				break;
+		}
+		if ((extent_flags & BTRFS_EXTENT_FLAG_TREE_BLOCK) &&
+		    does_range_cross_boundary(extent_start, extent_len,
+			    		      logical_start, logical_length)) {
+			btrfs_err(fs_info,
+"scrub: tree block %llu spanning boundaries, ignored. boundary=[%llu, %llu)",
+				  extent_start, logical_start, logical_end);
+			spin_lock(&sctx->stat_lock);
+			sctx->stat.uncorrectable_errors++;
+			spin_unlock(&sctx->stat_lock);
+			cur_logical += scrub_len;
+			continue;
+		}
+		ret = scrub_extent(sctx, map, cur_logical, scrub_len, cur_physical,
+				   target_dev, extent_flags, extent_gen,
+				   cur_mirror, cur_logical - logical_start +
+				   physical);
+		scrub_free_csums(sctx);
+		if (ret)
+			break;
+		if (sctx->is_dev_replace)
+			sync_replace_for_zoned(sctx);
+		cur_logical += scrub_len;
+		/* Don't hold the CPU for too long */
+		cond_resched();
+	}
+	btrfs_release_path(&path);
+	return ret;
+}
+
 static noinline_for_stack int scrub_stripe(struct scrub_ctx *sctx,
 					   struct btrfs_block_group *bg,
 					   struct map_lookup *map,
@@ -3285,6 +3451,7 @@ static noinline_for_stack int scrub_stripe(struct scrub_ctx *sctx,
 	struct btrfs_root *csum_root;
 	struct btrfs_extent_item *extent;
 	struct blk_plug plug;
+	const u64 profile = map->type & BTRFS_BLOCK_GROUP_PROFILE_MASK;
 	const u64 chunk_logical = bg->start;
 	u64 flags;
 	int ret;
@@ -3379,6 +3546,29 @@ static noinline_for_stack int scrub_stripe(struct scrub_ctx *sctx,
 		sctx->flush_all_writes = true;
 	}
 
+	/*
+	 * There used to be a big double loop to handle all profiles using the
+	 * same routine, which grew larger and more gross over time.
+	 *
+	 * So here we handle each profile differently, so that simpler profiles
+	 * have a simpler scrubbing function.
+	 */
+	if (!(profile & (BTRFS_BLOCK_GROUP_RAID0 | BTRFS_BLOCK_GROUP_RAID10 |
+			 BTRFS_BLOCK_GROUP_RAID56_MASK))) {
+		/*
+		 * The above check rules out all complex profiles; the remaining
+		 * profiles are SINGLE|DUP|RAID1|RAID1C*, which are simple
+		 * mirrored duplication without stripes.
+		 *
+		 * Only @physical and @mirror_num need to be calculated using
+		 * @stripe_index.
+		 */
+		ret = scrub_simple_mirror(sctx, root, csum_root, bg, map,
+				bg->start, bg->length, scrub_dev,
+				map->stripes[stripe_index].physical,
+				stripe_index + 1);
+		goto out;
+	}
 	/*
 	 * now find all extents for each stripe and scrub them
 	 */
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 3/4] btrfs: introduce dedicated helper to scrub simple-stripe based range
  2022-01-07  2:34 [PATCH 0/4] btrfs: refactor scrub entrances for each profile Qu Wenruo
  2022-01-07  2:34 ` [PATCH 1/4] btrfs: introduce a helper to locate an extent item Qu Wenruo
  2022-01-07  2:34 ` [PATCH 2/4] btrfs: introduce dedicated helper to scrub simple-mirror based range Qu Wenruo
@ 2022-01-07  2:34 ` Qu Wenruo
  2022-01-07  2:34 ` [PATCH 4/4] btrfs: use scrub_simple_mirror() to handle RAID56 data stripe scrub Qu Wenruo
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Qu Wenruo @ 2022-01-07  2:34 UTC (permalink / raw)
  To: linux-btrfs

The new entrance will iterate through each data stripe which belongs to
the target device.

And since inside each data stripe, RAID0 is just SINGLE, while RAID10 is
just RAID1, we can reuse scrub_simple_mirror() to do the scrub properly.
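
As a worked example (illustrative numbers only): for a 4-device RAID10 chunk
with 64K stripe_len (num_stripes = 4, sub_stripes = 2), the full stripe length
is 4 / 2 * 64K = 128K.  For stripe_index 3, the first logical offset is
(3 / 2) * 64K = 64K into the block group, the mirror number is 3 % 2 + 1 = 2,
and each scrub_simple_mirror() call covers one 64K stripe before advancing
128K logically (but only 64K physically).  For RAID0, sub_stripes is 1, thus
the mirror number is always 1.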

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/scrub.c | 100 +++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 88 insertions(+), 12 deletions(-)

diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index 7b39705502b4..5dcdebd59b93 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -3010,6 +3010,15 @@ static void get_extent_info(struct btrfs_path *path, u64 *extent_start_ret,
 	*generation_ret = btrfs_extent_generation(path->nodes[0], ei);
 }
 
+static bool does_range_cross_boundary(u64 extent_start, u64 extent_len,
+				      u64 boundary_start, u64 boudary_len)
+{
+	return (extent_start < boundary_start &&
+		extent_start + extent_len > boundary_start) ||
+	       (extent_start < boundary_start + boudary_len &&
+		extent_start + extent_len > boundary_start + boudary_len);
+}
+
 static noinline_for_stack int scrub_raid56_parity(struct scrub_ctx *sctx,
 						  struct map_lookup *map,
 						  struct btrfs_device *sdev,
@@ -3293,15 +3302,6 @@ static int sync_write_pointer_for_zoned(struct scrub_ctx *sctx, u64 logical,
 	return ret;
 }
 
-static bool does_range_cross_boundary(u64 extent_start, u64 extent_len,
-				      u64 boundary_start, u64 boudary_len)
-{
-	return (extent_start < boundary_start &&
-		extent_start + extent_len > boundary_start) ||
-	       (extent_start < boundary_start + boudary_len &&
-		extent_start + extent_len > boundary_start + boudary_len);
-}
-
 /*
  * Scrub one range which only has simple mirror based profiles.
  * (Including all ranges in SINGLE/DUP/RAID1/RAID1C*, and each stripe in
@@ -3439,6 +3439,77 @@ static int scrub_simple_mirror(struct scrub_ctx *sctx,
 	return ret;
 }
 
+/* Calculate the full stripe length for simple stripe based profiles */
+static u64 simple_stripe_full_stripe_len(struct map_lookup *map)
+{
+	ASSERT(map->type & (BTRFS_BLOCK_GROUP_RAID0 |
+			    BTRFS_BLOCK_GROUP_RAID10));
+
+	return map->num_stripes / map->sub_stripes * map->stripe_len;
+}
+
+/* Get the logical bytenr for the stripe */
+static u64 simple_stripe_get_logical(struct map_lookup *map,
+				     struct btrfs_block_group *bg,
+				     int stripe_index)
+{
+	ASSERT(map->type & (BTRFS_BLOCK_GROUP_RAID0 |
+			    BTRFS_BLOCK_GROUP_RAID10));
+	ASSERT(stripe_index < map->num_stripes);
+
+	/*
+	 * (stripe_index / sub_stripes) gives how many data stripes we need to
+	 * skip.
+	 */
+	return (stripe_index / map->sub_stripes) * map->stripe_len + bg->start;
+}
+
+/* Get the mirror number for the stripe */
+static int simple_stripe_mirror_num(struct map_lookup *map, int stripe_index)
+{
+	ASSERT(map->type & (BTRFS_BLOCK_GROUP_RAID0 |
+			    BTRFS_BLOCK_GROUP_RAID10));
+	ASSERT(stripe_index < map->num_stripes);
+
+	/* For RAID0 mirror_num is always 1, for RAID10 it alternates 1,2,1,2... */
+	return stripe_index % map->sub_stripes + 1;
+}
+
+static int scrub_simple_stripe(struct scrub_ctx *sctx,
+				struct btrfs_root *extent_root,
+				struct btrfs_root *csum_root,
+				struct btrfs_block_group *bg,
+				struct map_lookup *map,
+				struct btrfs_device *device,
+				int stripe_index)
+{
+	const u64 logical_increment = simple_stripe_full_stripe_len(map);
+	const u64 orig_logical = simple_stripe_get_logical(map, bg, stripe_index);
+	const u64 orig_physical = map->stripes[stripe_index].physical;
+	const int mirror_num = simple_stripe_mirror_num(map, stripe_index);
+	u64 cur_logical = orig_logical;
+	u64 cur_physical = orig_physical;
+	int ret = 0;
+
+	while (cur_logical < bg->start + bg->length) {
+		/*
+		 * Inside each stripe, RAID0 is just SINGLE, and RAID10 is
+		 * just RAID1, so we can reuse scrub_simple_mirror() to scrub
+		 * this stripe.
+		 */
+		ret = scrub_simple_mirror(sctx, extent_root, csum_root, bg, map,
+					  cur_logical, map->stripe_len, device,
+					  cur_physical, mirror_num);
+		if (ret)
+			return ret;
+		/* Skip to next stripe which belongs to the target device */
+		cur_logical += logical_increment;
+		/* For physical offset, we just go to next stripe */
+		cur_physical += map->stripe_len;
+	}
+	return ret;
+}
+
 static noinline_for_stack int scrub_stripe(struct scrub_ctx *sctx,
 					   struct btrfs_block_group *bg,
 					   struct map_lookup *map,
@@ -3569,9 +3640,14 @@ static noinline_for_stack int scrub_stripe(struct scrub_ctx *sctx,
 				stripe_index + 1);
 		goto out;
 	}
-	/*
-	 * now find all extents for each stripe and scrub them
-	 */
+	if (profile & (BTRFS_BLOCK_GROUP_RAID0 | BTRFS_BLOCK_GROUP_RAID10)) {
+		ret = scrub_simple_stripe(sctx, root, csum_root, bg, map,
+					  scrub_dev, stripe_index);
+		goto out;
+	}
+
+	/* Only RAID56 goes through the old code */
+	ASSERT(map->type & BTRFS_BLOCK_GROUP_RAID56_MASK);
 	ret = 0;
 	while (physical < physical_end) {
 		/*
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 4/4] btrfs: use scrub_simple_mirror() to handle RAID56 data stripe scrub
  2022-01-07  2:34 [PATCH 0/4] btrfs: refactor scrub entrances for each profile Qu Wenruo
                   ` (2 preceding siblings ...)
  2022-01-07  2:34 ` [PATCH 3/4] btrfs: introduce dedicated helper to scrub simple-stripe " Qu Wenruo
@ 2022-01-07  2:34 ` Qu Wenruo
  2022-01-07 14:44 ` [PATCH 0/4] btrfs: refactor scrub entrances for each profile David Sterba
  2022-01-19  5:52 ` Qu Wenruo
  5 siblings, 0 replies; 10+ messages in thread
From: Qu Wenruo @ 2022-01-07  2:34 UTC (permalink / raw)
  To: linux-btrfs

Although RAID56 has a complex repair mechanism, which involves reading the
whole full stripe, the current data stripe scrub is in fact no different
from SINGLE/RAID1.

The point here is, for data stripes we just check the csum for each
extent we hit.
Only in the csum mismatch case do our repair paths diverge.

So we can still reuse scrub_simple_mirror() for RAID56 data stripes.
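
Concretely, each data stripe range is now handed to the simple mirror helper
with a length of one stripe_len and mirror_num 1 (taken from the hunk below):

	ret = scrub_simple_mirror(sctx, root, csum_root, bg, map,
				  logical, map->stripe_len,
				  scrub_dev, physical, 1);

The parity stripes keep going through the existing scrub_raid56_parity()
path.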

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/scrub.c | 330 +++++++----------------------------------------
 1 file changed, 49 insertions(+), 281 deletions(-)

diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index 5dcdebd59b93..9c6954d7f604 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -3520,66 +3520,27 @@ static noinline_for_stack int scrub_stripe(struct scrub_ctx *sctx,
 	struct btrfs_fs_info *fs_info = sctx->fs_info;
 	struct btrfs_root *root;
 	struct btrfs_root *csum_root;
-	struct btrfs_extent_item *extent;
 	struct blk_plug plug;
 	const u64 profile = map->type & BTRFS_BLOCK_GROUP_PROFILE_MASK;
 	const u64 chunk_logical = bg->start;
-	u64 flags;
 	int ret;
-	int slot;
 	u64 nstripes;
-	struct extent_buffer *l;
 	u64 physical;
 	u64 logical;
 	u64 logic_end;
 	u64 physical_end;
-	u64 generation;
-	int mirror_num;
-	struct btrfs_key key;
-	u64 increment = map->stripe_len;
-	u64 offset;
-	u64 extent_logical;
-	u64 extent_physical;
-	/*
-	 * Unlike chunk length, extent length should never go beyond
-	 * BTRFS_MAX_EXTENT_SIZE, thus u32 is enough here.
-	 */
-	u32 extent_len;
+	u64 increment;	/* The logical increment after finishing one stripe */
+	u64 offset;	/* Offset inside the chunk */
 	u64 stripe_logical;
 	u64 stripe_end;
-	struct btrfs_device *extent_dev;
-	int extent_mirror_num;
 	int stop_loop = 0;
 
-	physical = map->stripes[stripe_index].physical;
-	offset = 0;
-	nstripes = div64_u64(dev_extent_len, map->stripe_len);
-	mirror_num = 1;
-	increment = map->stripe_len;
-	if (map->type & BTRFS_BLOCK_GROUP_RAID0) {
-		offset = map->stripe_len * stripe_index;
-		increment = map->stripe_len * map->num_stripes;
-	} else if (map->type & BTRFS_BLOCK_GROUP_RAID10) {
-		int factor = map->num_stripes / map->sub_stripes;
-		offset = map->stripe_len * (stripe_index / map->sub_stripes);
-		increment = map->stripe_len * factor;
-		mirror_num = stripe_index % map->sub_stripes + 1;
-	} else if (map->type & BTRFS_BLOCK_GROUP_RAID1_MASK) {
-		mirror_num = stripe_index % map->num_stripes + 1;
-	} else if (map->type & BTRFS_BLOCK_GROUP_DUP) {
-		mirror_num = stripe_index % map->num_stripes + 1;
-	} else if (map->type & BTRFS_BLOCK_GROUP_RAID56_MASK) {
-		get_raid56_logic_offset(physical, stripe_index, map, &offset,
-					NULL);
-		increment = map->stripe_len * nr_data_stripes(map);
-	}
-
 	path = btrfs_alloc_path();
 	if (!path)
 		return -ENOMEM;
 
 	/*
-	 * work on commit root. The related disk blocks are static as
+	 * Work on commit root. The related disk blocks are static as
 	 * long as COW is applied. This means, it is save to rewrite
 	 * them to repair disk errors without any race conditions
 	 */
@@ -3587,32 +3548,24 @@ static noinline_for_stack int scrub_stripe(struct scrub_ctx *sctx,
 	path->skip_locking = 1;
 	path->reada = READA_FORWARD;
 
-	logical = chunk_logical + offset;
-	physical_end = physical + nstripes * map->stripe_len;
-	if (map->type & BTRFS_BLOCK_GROUP_RAID56_MASK) {
-		get_raid56_logic_offset(physical_end, stripe_index,
-					map, &logic_end, NULL);
-		logic_end += chunk_logical;
-	} else {
-		logic_end = logical + increment * nstripes;
-	}
 	wait_event(sctx->list_wait,
 		   atomic_read(&sctx->bios_in_flight) == 0);
 	scrub_blocked_if_needed(fs_info);
 
-	root = btrfs_extent_root(fs_info, logical);
-	csum_root = btrfs_csum_root(fs_info, logical);
+	root = btrfs_extent_root(fs_info, bg->start);
+	csum_root = btrfs_csum_root(fs_info, bg->start);
 
 	/*
-	 * collect all data csums for the stripe to avoid seeking during
+	 * Collect all data csums for the stripe to avoid seeking during
 	 * the scrub. This might currently (crc32) end up to be about 1MB
 	 */
 	blk_start_plug(&plug);
 
 	if (sctx->is_dev_replace &&
-	    btrfs_dev_is_sequential(sctx->wr_tgtdev, physical)) {
+	    btrfs_dev_is_sequential(sctx->wr_tgtdev,
+				    map->stripes[stripe_index].physical)) {
 		mutex_lock(&sctx->wr_lock);
-		sctx->write_pointer = physical;
+		sctx->write_pointer = map->stripes[stripe_index].physical;
 		mutex_unlock(&sctx->wr_lock);
 		sctx->flush_all_writes = true;
 	}
@@ -3648,239 +3601,54 @@ static noinline_for_stack int scrub_stripe(struct scrub_ctx *sctx,
 
 	/* Only RAID56 goes through the old code */
 	ASSERT(map->type & BTRFS_BLOCK_GROUP_RAID56_MASK);
+
+	physical = map->stripes[stripe_index].physical;
+	offset = 0;
+	nstripes = div64_u64(dev_extent_len, map->stripe_len);
+	get_raid56_logic_offset(physical, stripe_index, map, &offset, NULL);
+	increment = map->stripe_len * nr_data_stripes(map);
+
+	logical = chunk_logical + offset;
+	physical_end = physical + nstripes * map->stripe_len;
+	get_raid56_logic_offset(physical_end, stripe_index, map, &logic_end,
+				NULL);
+	logic_end += chunk_logical;
+
 	ret = 0;
+	/*
+	 * Due to the rotation, for RAID56 it's better to iterate each stripe
+	 * using its physical offset.
+	 */
 	while (physical < physical_end) {
-		/*
-		 * canceled?
-		 */
-		if (atomic_read(&fs_info->scrub_cancel_req) ||
-		    atomic_read(&sctx->cancel_req)) {
-			ret = -ECANCELED;
-			goto out;
+		ret = get_raid56_logic_offset(physical, stripe_index, map,
+					      &logical, &stripe_logical);
+		logical += chunk_logical;
+		if (ret) {
+			/* It is a parity stripe */
+			stripe_logical += chunk_logical;
+			stripe_end = stripe_logical + increment;
+			ret = scrub_raid56_parity(sctx, map, scrub_dev,
+						  stripe_logical,
+						  stripe_end);
+			if (ret)
+				goto out;
+			goto next;
 		}
+
 		/*
-		 * check to see if we have to pause
+		 * Now we're at a data stripe, scrub each extent in the range.
+		 *
+		 * At this stage, if we ignore the repair part, each data stripe
+		 * is no different from the SINGLE profile.
+		 * We can reuse scrub_simple_mirror() here, as the repair part
+		 * is still based on @mirror_num.
 		 */
-		if (atomic_read(&fs_info->scrub_pause_req)) {
-			/* push queued extents */
-			sctx->flush_all_writes = true;
-			scrub_submit(sctx);
-			mutex_lock(&sctx->wr_lock);
-			scrub_wr_submit(sctx);
-			mutex_unlock(&sctx->wr_lock);
-			wait_event(sctx->list_wait,
-				   atomic_read(&sctx->bios_in_flight) == 0);
-			sctx->flush_all_writes = false;
-			scrub_blocked_if_needed(fs_info);
-		}
-
-		if (map->type & BTRFS_BLOCK_GROUP_RAID56_MASK) {
-			ret = get_raid56_logic_offset(physical, stripe_index,
-						      map, &logical,
-						      &stripe_logical);
-			logical += chunk_logical;
-			if (ret) {
-				/* it is parity strip */
-				stripe_logical += chunk_logical;
-				stripe_end = stripe_logical + increment;
-				ret = scrub_raid56_parity(sctx, map, scrub_dev,
-							  stripe_logical,
-							  stripe_end);
-				if (ret)
-					goto out;
-				goto skip;
-			}
-		}
-
-		if (btrfs_fs_incompat(fs_info, SKINNY_METADATA))
-			key.type = BTRFS_METADATA_ITEM_KEY;
-		else
-			key.type = BTRFS_EXTENT_ITEM_KEY;
-		key.objectid = logical;
-		key.offset = (u64)-1;
-
-		ret = btrfs_search_slot(NULL, root, &key, path, 0, 0);
+		ret = scrub_simple_mirror(sctx, root, csum_root, bg, map,
+					  logical, map->stripe_len,
+					  scrub_dev, physical, 1);
 		if (ret < 0)
 			goto out;
-
-		if (ret > 0) {
-			ret = btrfs_previous_extent_item(root, path, 0);
-			if (ret < 0)
-				goto out;
-			if (ret > 0) {
-				/* there's no smaller item, so stick with the
-				 * larger one */
-				btrfs_release_path(path);
-				ret = btrfs_search_slot(NULL, root, &key,
-							path, 0, 0);
-				if (ret < 0)
-					goto out;
-			}
-		}
-
-		stop_loop = 0;
-		while (1) {
-			u64 bytes;
-
-			l = path->nodes[0];
-			slot = path->slots[0];
-			if (slot >= btrfs_header_nritems(l)) {
-				ret = btrfs_next_leaf(root, path);
-				if (ret == 0)
-					continue;
-				if (ret < 0)
-					goto out;
-
-				stop_loop = 1;
-				break;
-			}
-			btrfs_item_key_to_cpu(l, &key, slot);
-
-			if (key.type != BTRFS_EXTENT_ITEM_KEY &&
-			    key.type != BTRFS_METADATA_ITEM_KEY)
-				goto next;
-
-			if (key.type == BTRFS_METADATA_ITEM_KEY)
-				bytes = fs_info->nodesize;
-			else
-				bytes = key.offset;
-
-			if (key.objectid + bytes <= logical)
-				goto next;
-
-			if (key.objectid >= logical + map->stripe_len) {
-				/* out of this device extent */
-				if (key.objectid >= logic_end)
-					stop_loop = 1;
-				break;
-			}
-
-			/*
-			 * If our block group was removed in the meanwhile, just
-			 * stop scrubbing since there is no point in continuing.
-			 * Continuing would prevent reusing its device extents
-			 * for new block groups for a long time.
-			 */
-			spin_lock(&bg->lock);
-			if (bg->removed) {
-				spin_unlock(&bg->lock);
-				ret = 0;
-				goto out;
-			}
-			spin_unlock(&bg->lock);
-
-			extent = btrfs_item_ptr(l, slot,
-						struct btrfs_extent_item);
-			flags = btrfs_extent_flags(l, extent);
-			generation = btrfs_extent_generation(l, extent);
-
-			if ((flags & BTRFS_EXTENT_FLAG_TREE_BLOCK) &&
-			    (key.objectid < logical ||
-			     key.objectid + bytes >
-			     logical + map->stripe_len)) {
-				btrfs_err(fs_info,
-					   "scrub: tree block %llu spanning stripes, ignored. logical=%llu",
-				       key.objectid, logical);
-				spin_lock(&sctx->stat_lock);
-				sctx->stat.uncorrectable_errors++;
-				spin_unlock(&sctx->stat_lock);
-				goto next;
-			}
-
-again:
-			extent_logical = key.objectid;
-			ASSERT(bytes <= U32_MAX);
-			extent_len = bytes;
-
-			/*
-			 * trim extent to this stripe
-			 */
-			if (extent_logical < logical) {
-				extent_len -= logical - extent_logical;
-				extent_logical = logical;
-			}
-			if (extent_logical + extent_len >
-			    logical + map->stripe_len) {
-				extent_len = logical + map->stripe_len -
-					     extent_logical;
-			}
-
-			extent_physical = extent_logical - logical + physical;
-			extent_dev = scrub_dev;
-			extent_mirror_num = mirror_num;
-			if (sctx->is_dev_replace)
-				scrub_remap_extent(fs_info, extent_logical,
-						   extent_len, &extent_physical,
-						   &extent_dev,
-						   &extent_mirror_num);
-
-			if (flags & BTRFS_EXTENT_FLAG_DATA) {
-				ret = btrfs_lookup_csums_range(csum_root,
-						extent_logical,
-						extent_logical + extent_len - 1,
-						&sctx->csum_list, 1);
-				if (ret)
-					goto out;
-			}
-
-			ret = scrub_extent(sctx, map, extent_logical, extent_len,
-					   extent_physical, extent_dev, flags,
-					   generation, extent_mirror_num,
-					   extent_logical - logical + physical);
-
-			scrub_free_csums(sctx);
-
-			if (ret)
-				goto out;
-
-			if (sctx->is_dev_replace)
-				sync_replace_for_zoned(sctx);
-
-			if (extent_logical + extent_len <
-			    key.objectid + bytes) {
-				if (map->type & BTRFS_BLOCK_GROUP_RAID56_MASK) {
-					/*
-					 * loop until we find next data stripe
-					 * or we have finished all stripes.
-					 */
-loop:
-					physical += map->stripe_len;
-					ret = get_raid56_logic_offset(physical,
-							stripe_index, map,
-							&logical, &stripe_logical);
-					logical += chunk_logical;
-
-					if (ret && physical < physical_end) {
-						stripe_logical += chunk_logical;
-						stripe_end = stripe_logical +
-								increment;
-						ret = scrub_raid56_parity(sctx,
-							map, scrub_dev,
-							stripe_logical,
-							stripe_end);
-						if (ret)
-							goto out;
-						goto loop;
-					}
-				} else {
-					physical += map->stripe_len;
-					logical += increment;
-				}
-				if (logical < key.objectid + bytes) {
-					cond_resched();
-					goto again;
-				}
-
-				if (physical >= physical_end) {
-					stop_loop = 1;
-					break;
-				}
-			}
 next:
-			path->slots[0]++;
-		}
-		btrfs_release_path(path);
-skip:
 		logical += increment;
 		physical += map->stripe_len;
 		spin_lock(&sctx->stat_lock);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 0/4] btrfs: refactor scrub entrances for each profile
  2022-01-07  2:34 [PATCH 0/4] btrfs: refactor scrub entrances for each profile Qu Wenruo
                   ` (3 preceding siblings ...)
  2022-01-07  2:34 ` [PATCH 4/4] btrfs: use scrub_simple_mirror() to handle RAID56 data stripe scrub Qu Wenruo
@ 2022-01-07 14:44 ` David Sterba
  2022-01-19  5:52 ` Qu Wenruo
  5 siblings, 0 replies; 10+ messages in thread
From: David Sterba @ 2022-01-07 14:44 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

On Fri, Jan 07, 2022 at 10:34:26AM +0800, Qu Wenruo wrote:
> The branch is based on several recent submitted small cleanups, thus
> it's better to fetch the branch from github:
> 
> https://github.com/adam900710/linux/tree/refactor_scrub
> 
> [CRAP-BUT-IT-WORKS(TM)]
> 
> Scrub is one of the area we seldom touch because:
> 
> - It's a mess
>   Just check scrub_stripe() function.
>   It's a function scrubbing a stripe for *all* profiles.
> 
>   It's near 400 lines for a single complex function, with double while()
>   loop and several different jumps inside the loop.
> 
>   Not to mention the lack of comments for various structures.
> 
>   This should and will never happen under our current code standard.
> 
> - It just works
>   I have hit more than 10 bugs during development, and I just want to
>   give up the refactor, as even the code is crap, it works, passing the
>   existing scrub/replace group.
>   While no matter how small code change I'm doing, it always fails to pass
>   the same tests.
> 
> [REFACTOR-IDEA]
> 
> The core idea here, is to get rid of one-fit-all solution for
> scrub_stripe().
> 
> Instead, we explicitly separate the scrub into 3 groups (the idea is
> from my btrfs-fuse project):
> 
> - Simple-mirror based profiles
>   This includes SINGLE/DUP/RAID1/RAID1C* profiles.
>   They have no stripe, and their repair is purely mirror based.
> 
> - Simple-stripe based profiles
>   This includes RAID0/RAID10 profiles.
>   They are just simple stripe (without P/Q nor rotation), with extra
>   mirrors to support their repair.
> 
> - RAID56
>   The most complex profiles, they have extra P/Q, and have rotation.
> 
> [REFACTOR-IMPLEMENTATION]
> 
> So we have 3 entrances for all those supported profiles:
> 
> - scrub_simple_mirror()
>   For SINGLE/DUP/RAID1/RAID1C* profiles.
>   Just go through each extent and scrub the extent.
> 
> - scrub_simple_stripe()
>   For RAID0/RAID10 profiles.
>   Instead we go each data stripe first, then inside each data stripe, we
>   can call scrub_simple_mirror(), since after stripe split, RAID0 is
>   just SINGLE and RAID10 is just RAID1.
> 
> - scrub_stripe() untouched for RAID56
>   RAID56 still has all the complex things to do, but they can still be
>   split into two types (already done by the original code)
> 
>   * data stripes
>     They are no different in the verification part, RAID56 is just
>     SINGLE if we ignore the repair path.
>     It's only in repair path that our path divides.
> 
>     So we can reuse scrub_simple_mirror() again.
> 
>   * P/Q stripes
>     They already have a dedicated function handling the case.
> 
> With all these refactors, although we have several more functions, we
> get rid of:
> 
> - A double while () loop
> - Several jumps inside the double loop
> - Complex calculation to try to fit all profiles
> 
> And we get:
> 
> - Better comments
> - More dedicated functions
> - A better basis for further refactors
> 
> [INCOMING CLEANUPS]
> 
> - Use find_first_extent_item() to cleanup the RAID56 code
>   This part itself can be as large this patchset already, thus will be
>   in its own patchset.
> 
> - Refactor scrub_pages/scrub_parity/... structures

I've paged through the patches and read the above to get an overall
idea, yeah, I think we have to do some radical cuts in the scrub code
along the lines you suggest. Small incremental changes don't seem to be
feasible when the design has to be changed. Splitting by the redundancy
and parity of the profile sounds reasonable, we have the definitions in
the raid table so no more enumerations by profile etc.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 0/4] btrfs: refactor scrub entrances for each profile
  2022-01-07  2:34 [PATCH 0/4] btrfs: refactor scrub entrances for each profile Qu Wenruo
                   ` (4 preceding siblings ...)
  2022-01-07 14:44 ` [PATCH 0/4] btrfs: refactor scrub entrances for each profile David Sterba
@ 2022-01-19  5:52 ` Qu Wenruo
  2022-01-24 20:24   ` David Sterba
  5 siblings, 1 reply; 10+ messages in thread
From: Qu Wenruo @ 2022-01-19  5:52 UTC (permalink / raw)
  To: linux-btrfs, David Sterba

Hi David,

Any plan to merge this patchset?

And if you need, I can merge all these refactors into one branch for you 
to fetch.

I know this sounds terrifying, especially after the autodefrag "refactor"
disaster, but unlike autodefrag, scrub/replace has existing test case
coverage, and this time it's really a refactor, with no behavior change
(at least no intentional one).

I hope to get a stable base for the incoming scrub_page/scrub_bio 
structure cleanup.

There is a plan to make scrub use page::index for those anonymous pages,
so they don't need to rely on bi_private to get their logical bytenr, and
hopefully just one structure, scrub_range, will replace the
scrub_page/scrub_bio things.

Thanks,
Qu

On 2022/1/7 10:34, Qu Wenruo wrote:
> The branch is based on several recent submitted small cleanups, thus
> it's better to fetch the branch from github:
> 
> https://github.com/adam900710/linux/tree/refactor_scrub
> 
> [CRAP-BUT-IT-WORKS(TM)]
> 
> Scrub is one of the area we seldom touch because:
> 
> - It's a mess
>    Just check scrub_stripe() function.
>    It's a function scrubbing a stripe for *all* profiles.
> 
>    It's near 400 lines for a single complex function, with double while()
>    loop and several different jumps inside the loop.
> 
>    Not to mention the lack of comments for various structures.
> 
>    This should and will never happen under our current code standard.
> 
> - It just works
>    I have hit more than 10 bugs during development, and I just want to
>    give up the refactor, as even the code is crap, it works, passing the
>    existing scrub/replace group.
>    While no matter how small code change I'm doing, it always fails to pass
>    the same tests.
> 
> [REFACTOR-IDEA]
> 
> The core idea here, is to get rid of one-fit-all solution for
> scrub_stripe().
> 
> Instead, we explicitly separate the scrub into 3 groups (the idea is
> from my btrfs-fuse project):
> 
> - Simple-mirror based profiles
>    This includes SINGLE/DUP/RAID1/RAID1C* profiles.
>    They have no stripe, and their repair is purely mirror based.
> 
> - Simple-stripe based profiles
>    This includes RAID0/RAID10 profiles.
>    They are just simple stripe (without P/Q nor rotation), with extra
>    mirrors to support their repair.
> 
> - RAID56
>    The most complex profiles, they have extra P/Q, and have rotation.
> 
> [REFACTOR-IMPLEMENTATION]
> 
> So we have 3 entrances for all those supported profiles:
> 
> - scrub_simple_mirror()
>    For SINGLE/DUP/RAID1/RAID1C* profiles.
>    Just go through each extent and scrub the extent.
> 
> - scrub_simple_stripe()
>    For RAID0/RAID10 profiles.
>    Instead we go each data stripe first, then inside each data stripe, we
>    can call scrub_simple_mirror(), since after stripe split, RAID0 is
>    just SINGLE and RAID10 is just RAID1.
> 
> - scrub_stripe() untouched for RAID56
>    RAID56 still has all the complex things to do, but they can still be
>    split into two types (already done by the original code)
> 
>    * data stripes
>      They are no different in the verification part, RAID56 is just
>      SINGLE if we ignore the repair path.
>      It's only in repair path that our path divides.
> 
>      So we can reuse scrub_simple_mirror() again.
> 
>    * P/Q stripes
>      They already have a dedicated function handling the case.
> 
> With all these refactors, although we have several more functions, we
> get rid of:
> 
> - A double while () loop
> - Several jumps inside the double loop
> - Complex calculation to try to fit all profiles
> 
> And we get:
> 
> - Better comments
> - More dedicated functions
> - A better basis for further refactors
> 
> [INCOMING CLEANUPS]
> 
> - Use find_first_extent_item() to cleanup the RAID56 code
>    This part itself can be as large this patchset already, thus will be
>    in its own patchset.
> 
> - Refactor scrub_pages/scrub_parity/... structures
> 
> Qu Wenruo (4):
>    btrfs: introduce a helper to locate an extent item
>    btrfs: introduce dedicated helper to scrub simple-mirror based range
>    btrfs: introduce dedicated helper to scrub simple-stripe based range
>    btrfs: use scrub_simple_mirror() to handle RAID56 data stripe scrub
> 
>   fs/btrfs/scrub.c | 702 ++++++++++++++++++++++++++++-------------------
>   1 file changed, 422 insertions(+), 280 deletions(-)
> 


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 0/4] btrfs: refactor scrub entrances for each profile
  2022-01-19  5:52 ` Qu Wenruo
@ 2022-01-24 20:24   ` David Sterba
  2022-01-25  0:06     ` Qu Wenruo
  0 siblings, 1 reply; 10+ messages in thread
From: David Sterba @ 2022-01-24 20:24 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs, David Sterba

On Wed, Jan 19, 2022 at 01:52:25PM +0800, Qu Wenruo wrote:
> Hi David,
> 
> Any plan to merge this patchset?
> 
> And if you need, I can merge all these refactors into one branch for you 
> to fetch.
> 
> I know this sounds terrifying especially after the autodefrag "refactor" 
> disaster, but unlike autodefrag, scrub/replace has existing test case 
> coverage and this time it's really refactor, no behavior change (at 
> least no intentional one).

Well, exactly because of the same change pattern that was done in the
defrag code I'm hesitant to merge it, even if there is test coverage.

Nevertheless, I will put it into for-next.

> I hope to get a stable base for the incoming scrub_page/scrub_bio 
> structure cleanup.
> 
> As there is some plan to make scrub to use page::index for those 
> anonymous pages, so they don't need to rely on bi_private to get their 
> logical bytenr, and hopefully just one structure, scrub_range, to 
> replace the scrub_page/scrub_bio things.

I don't think there's any other potential use of the page::index member
so feel free to use it.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 0/4] btrfs: refactor scrub entrances for each profile
  2022-01-24 20:24   ` David Sterba
@ 2022-01-25  0:06     ` Qu Wenruo
  2022-01-25  5:47       ` Qu Wenruo
  0 siblings, 1 reply; 10+ messages in thread
From: Qu Wenruo @ 2022-01-25  0:06 UTC (permalink / raw)
  To: dsterba, linux-btrfs



On 2022/1/25 04:24, David Sterba wrote:
> On Wed, Jan 19, 2022 at 01:52:25PM +0800, Qu Wenruo wrote:
>> Hi David,
>>
>> Any plan to merge this patchset?
>>
>> And if you need, I can merge all these refactors into one branch for you
>> to fetch.
>>
>> I know this sounds terrifying especially after the autodefrag "refactor"
>> disaster, but unlike autodefrag, scrub/replace has existing test case
>> coverage and this time it's really refactor, no behavior change (at
>> least no intentional one).
> 
> Well, exactly because of the same change pattern that was done in the
> defrag code I'm hesistant to merge it, even if there is test coverage.
> 
> Nevertheless, I will put it to for-next.

Feel free to do more tests.

And even if this may sound like an excuse, the lack of defrag behavior
definition and test coverage contributed a lot to the defrag mess.

> 
>> I hope to get a stable base for the incoming scrub_page/scrub_bio
>> structure cleanup.
>>
>> As there is some plan to make scrub to use page::index for those
>> anonymous pages, so they don't need to rely on bi_private to get their
>> logical bytenr, and hopefully just one structure, scrub_range, to
>> replace the scrub_page/scrub_bio things.
> 
> I don't think there's any other potential use of the page::index member
> so feel free to use it.
> 
Thanks to the advice from Nik, I'm going to use page::private instead,
which has some unexpected benefits, like allowing us to use just one
64K page even if the 64K extent is not page aligned.

Thanks,
Qu


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 0/4] btrfs: refactor scrub entrances for each profile
  2022-01-25  0:06     ` Qu Wenruo
@ 2022-01-25  5:47       ` Qu Wenruo
  0 siblings, 0 replies; 10+ messages in thread
From: Qu Wenruo @ 2022-01-25  5:47 UTC (permalink / raw)
  To: dsterba, linux-btrfs



On 2022/1/25 08:06, Qu Wenruo wrote:
> 
> 
> On 2022/1/25 04:24, David Sterba wrote:
>> On Wed, Jan 19, 2022 at 01:52:25PM +0800, Qu Wenruo wrote:
>>> Hi David,
>>>
>>> Any plan to merge this patchset?
>>>
>>> And if you need, I can merge all these refactors into one branch for you
>>> to fetch.
>>>
>>> I know this sounds terrifying especially after the autodefrag "refactor"
>>> disaster, but unlike autodefrag, scrub/replace has existing test case
>>> coverage and this time it's really refactor, no behavior change (at
>>> least no intentional one).
>>
>> Well, exactly because of the same change pattern that was done in the
>> defrag code I'm hesistant to merge it, even if there is test coverage.
>>
>> Nevertheless, I will put it to for-next.
> 
> Feel free to do more tests.
> 
> And even this may sound like an excuse, the lack of defrag behavior 
> definition and test coverage is contributing a lot to the defrag mess.

Well, Su reported a random crash for "[PATCH 1/2] btrfs: cleanup the
argument list of scrub_chunk()", which adds a new ASSERT() to make sure
the chunk_offset matches the block group cache, and that ASSERT() gets
triggered.

So I guess the series can wait for a while.

Thanks,
Qu
> 
>>
>>> I hope to get a stable base for the incoming scrub_page/scrub_bio
>>> structure cleanup.
>>>
>>> As there is some plan to make scrub to use page::index for those
>>> anonymous pages, so they don't need to rely on bi_private to get their
>>> logical bytenr, and hopefully just one structure, scrub_range, to
>>> replace the scrub_page/scrub_bio things.
>>
>> I don't think there's any other potential use of the page::index member
>> so feel free to use it.
>>
> Thanks to the advice from Nik, I'm going to use page::private instead, 
> which have some unexpected benefits, like allowing us to use just one 
> 64K page if the 64K extent is not page aligned.
> 
> Thanks,
> Qu
> 


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2022-01-25  5:50 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-01-07  2:34 [PATCH 0/4] btrfs: refactor scrub entrances for each profile Qu Wenruo
2022-01-07  2:34 ` [PATCH 1/4] btrfs: introduce a helper to locate an extent item Qu Wenruo
2022-01-07  2:34 ` [PATCH 2/4] btrfs: introduce dedicated helper to scrub simple-mirror based range Qu Wenruo
2022-01-07  2:34 ` [PATCH 3/4] btrfs: introduce dedicated helper to scrub simple-stripe " Qu Wenruo
2022-01-07  2:34 ` [PATCH 4/4] btrfs: use scrub_simple_mirror() to handle RAID56 data stripe scrub Qu Wenruo
2022-01-07 14:44 ` [PATCH 0/4] btrfs: refactor scrub entrances for each profile David Sterba
2022-01-19  5:52 ` Qu Wenruo
2022-01-24 20:24   ` David Sterba
2022-01-25  0:06     ` Qu Wenruo
2022-01-25  5:47       ` Qu Wenruo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).