[PATCH v3 0/2] btrfs: defrag: bring back the old file extent search behavior and address merged extent map generation problem

* [PATCH v3 0/2] btrfs: defrag: bring back the old file extent search behavior and address merged extent map generation problem
@ 2022-02-11  6:46 Qu Wenruo
  2022-02-11  6:46 ` [PATCH v3 1/2] btrfs: defrag: bring back the old file extent search behavior Qu Wenruo
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Qu Wenruo @ 2022-02-11  6:46 UTC (permalink / raw)
  To: linux-btrfs

Filipe reported that the old defrag code used btrfs_search_forward() to
do the following optimizations:

- Don't cache extent maps
  To save memory in the long run

- Skip entire file ranges which don't meet the generation requirement

- Don't use merged extent maps, which have an unreliable generation

The first patch will bring back the old behavior, along with the old
optimizations.

However the 3rd problem is not that easy to solve, as data
read/readahead can also load extent maps into the cache, causing
extent maps to be merged.

Such already cached and merged extent maps will still confuse autodefrag,
because if we find a cached extent map, we will not try to read it from
disk again.

So to completely prevent merged extent maps tricking autodefrag, here
comes the 2nd patch, to mark merged extent maps for defrag.

If we hit a merged extent, and its generation meets our requirement, we
will not trust it but read from disk to get a reliable generation.

This should reduce defrag IO caused by the hidden extent map merging
behavior.

Changelog:
v2:
- Make defrag_get_em() more flexible in handling file extent
  iteration
  Now it will not reject an item key which is smaller than our target but
  doesn't have the wanted type/objectid.
  It will continue to the next item instead, to prevent skipping an extent.

- Properly reduce path.slots[0]
  There was a bug where I wanted to put "if (path.slots[0] == 0)" but I put
  "if (btrfs_header_nritems(path.slots[0]))".
  This is fixed by the reworked file extent iteration code.

- Address merged extent maps properly
  With fixed defrag_get_extent(), we can rely on it to get original em
  from disk.
  So what we need to do is just to ignore merged extents which meet
  our generation requirement.

v3:
- Rebased to latest misc-next

- Fix several misspellings of "generation"

- Fix a case where btrfs_search_slot() can lead to path->slots[0] >=
  nritems

- Fix the commit message on modified extent map
  Now that part mentioning fsync() doesn't help on the autodefrag bug.

- Update the wording on extent map read from subvolume trees

Qu Wenruo (2):
  btrfs: defrag: bring back the old file extent search behavior
  btrfs: defrag: don't use merged extent map for their generation check

 fs/btrfs/extent_map.c |   2 +
 fs/btrfs/extent_map.h |   8 ++
 fs/btrfs/ioctl.c      | 174 +++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 180 insertions(+), 4 deletions(-)

-- 
2.35.0



* [PATCH v3 1/2] btrfs: defrag: bring back the old file extent search behavior
  2022-02-11  6:46 [PATCH v3 0/2] btrfs: defrag: bring back the old file extent search behavior and address merged extent map generation problem Qu Wenruo
@ 2022-02-11  6:46 ` Qu Wenruo
  2022-02-14 16:15   ` David Sterba
  2022-02-11  6:46 ` [PATCH v3 2/2] btrfs: defrag: don't use merged extent map for their generation check Qu Wenruo
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 9+ messages in thread
From: Qu Wenruo @ 2022-02-11  6:46 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Filipe Manana

For defrag, we don't really want to use btrfs_get_extent() to iterate
all extent maps of an inode.

The reasons are:

- btrfs_get_extent() can merge extent maps
  The resulting em has the higher generation of the two, causing defrag
  to mark unnecessary parts of such a merged large extent map.

  This can in fact result in extra IO for autodefrag in v5.16+ kernels.

  However this patch is not going to completely solve the problem, as
  one can still use read() to trigger extent map reading, and get
  them merged.

  The complete solution for the extent map merging generation problem
  will come as a standalone fix.

- btrfs_get_extent() caches the extent map result
  Normally it's fine, but for defrag the target range may not get
  another read/write for a long long time.
  Such cache would only increase the memory usage.

- btrfs_get_extent() doesn't skip older extent maps
  Unlike the old find_new_extent(), which uses btrfs_search_forward() to
  skip the older subtrees, it will pick up unnecessary extent maps.

This patch will fix the regression by introducing defrag_get_extent() to
replace the btrfs_get_extent() call.

This helper will:

- Not cache the file extent we found
  It will search the file extent and manually convert it to em.

- Use btrfs_search_forward() to skip entire ranges which were modified
  in the past

This should reduce the IO for autodefrag.
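
To illustrate the skipping behavior described above, here is a minimal
user-space sketch. It assumes a toy tree in which every node records the
generation of the transaction that last wrote it, and in which a parent is
never older than its children (the property that COW gives btrfs trees).
The names toy_node and toy_search_forward() are hypothetical and are not
the kernel API; the real btrfs_search_forward() works on btrfs paths and
keys.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a btree node; not a btrfs structure. */
struct toy_node {
	uint64_t generation;		/* transaction that last wrote this node */
	size_t nr_children;		/* 0 for a leaf */
	const struct toy_node *children[4];
	uint64_t first_offset;		/* leaf only: file offset of its first item */
};

/*
 * Return the first leaf, in left-to-right order, whose generation is at
 * least @min_gen, or NULL if there is none.
 *
 * @visited counts how many nodes were actually inspected, to show that a
 * whole subtree is rejected by looking at its top node only.
 */
static const struct toy_node *toy_search_forward(const struct toy_node *node,
						 uint64_t min_gen,
						 unsigned int *visited)
{
	size_t i;

	assert(node->nr_children <= 4);
	(*visited)++;

	/* Everything below an old node is at least as old: skip it all. */
	if (node->generation < min_gen)
		return NULL;

	if (node->nr_children == 0)
		return node;

	for (i = 0; i < node->nr_children; i++) {
		const struct toy_node *found;

		found = toy_search_forward(node->children[i], min_gen, visited);
		if (found)
			return found;
	}
	return NULL;
}
```

In a tree where only one leaf was written in generation 9, a search with a
minimum generation of 8 inspects the root, rejects the old subtree at its
top node, and reaches the new leaf without touching any of the old leaves.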

Reported-by: Filipe Manana <fdmanana@suse.com>
Fixes: 7b508037d4ca ("btrfs: defrag: use defrag_one_cluster() to implement btrfs_defrag_file()")
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/ioctl.c | 160 +++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 156 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index e49f745af942..0e983b782968 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -1017,8 +1017,154 @@ static noinline int btrfs_mksnapshot(const struct path *parent,
 	return ret;
 }
 
+/*
+ * Defrag specific helper to get an extent map.
+ *
+ * Differences between this and btrfs_get_extent() are:
+ * - No extent_map will be added to inode->extent_tree
+ *   To reduce memory usage in the long run.
+ *
+ * - Extra optimization to skip file extents older than @newer_than
+ *   By using btrfs_search_forward() we can skip entire file ranges that
+ *   have extents created in past transactions, because btrfs_search_forward()
+ *   will not visit leaves and nodes with a generation smaller than given
+ *   minimal generation threshold (@newer_than).
+ *
+ * Return valid em if we find a file extent matching the requirement.
+ * Return NULL if we can not find a file extent matching the requirement.
+ *
+ * Return ERR_PTR() for error.
+ */
+static struct extent_map *defrag_get_extent(struct btrfs_inode *inode,
+					    u64 start, u64 newer_than)
+{
+	struct btrfs_root *root = inode->root;
+	struct btrfs_file_extent_item *fi;
+	struct btrfs_path path = {};
+	struct extent_map *em;
+	struct btrfs_key key;
+	u64 ino = btrfs_ino(inode);
+	int ret;
+
+	em = alloc_extent_map();
+	if (!em) {
+		ret = -ENOMEM;
+		goto err;
+	}
+
+	key.objectid = ino;
+	key.type = BTRFS_EXTENT_DATA_KEY;
+	key.offset = start;
+
+	if (newer_than) {
+		ret = btrfs_search_forward(root, &key, &path, newer_than);
+		if (ret < 0)
+			goto err;
+		/* Can't find anything newer */
+		if (ret > 0)
+			goto not_found;
+	} else {
+		ret = btrfs_search_slot(NULL, root, &key, &path, 0, 0);
+		if (ret < 0)
+			goto err;
+	}
+	if (path.slots[0] >= btrfs_header_nritems(path.nodes[0])) {
+		/*
+		 * If btrfs_search_slot() makes path to point beyond nritems,
+		 * we should not have an empty leaf, as this inode must at
+		 * least have its INODE_ITEM.
+		 */
+		ASSERT(btrfs_header_nritems(path.nodes[0]));
+		path.slots[0] = btrfs_header_nritems(path.nodes[0]) - 1;
+	}
+	btrfs_item_key_to_cpu(path.nodes[0], &key, path.slots[0]);
+	/* Perfect match, no need to go one slot back */
+	if (key.objectid == ino && key.type == BTRFS_EXTENT_DATA_KEY &&
+	    key.offset == start)
+		goto iterate;
+
+	/* We didn't find a perfect match, needs to go one slot back */
+	if (path.slots[0] > 0) {
+		btrfs_item_key_to_cpu(path.nodes[0], &key, path.slots[0]);
+		if (key.objectid == ino && key.type == BTRFS_EXTENT_DATA_KEY)
+			path.slots[0]--;
+	}
+
+iterate:
+	/* Iterate through the path to find a file extent covering @start */
+	while (true) {
+		u64 extent_end;
+
+		if (path.slots[0] >= btrfs_header_nritems(path.nodes[0]))
+			goto next;
+
+		btrfs_item_key_to_cpu(path.nodes[0], &key, path.slots[0]);
+
+		/*
+		 * We may go one slot back to INODE_REF/XATTR item, then
+		 * need to go forward until we reach an EXTENT_DATA.
+		 * But we should still have the correct ino as key.objectid.
+		 */
+		if (WARN_ON(key.objectid < ino) || key.type < BTRFS_EXTENT_DATA_KEY)
+			goto next;
+
+		/* It's beyond our target range, definitely no extent found */
+		if (key.objectid > ino || key.type > BTRFS_EXTENT_DATA_KEY)
+			goto not_found;
+
+		/*
+		 *	|	|<- File extent ->|
+		 *	\- start
+		 *
+		 * This means there is a hole between start and key.offset.
+		 */
+		if (key.offset > start) {
+			em->start = start;
+			em->orig_start = start;
+			em->block_start = EXTENT_MAP_HOLE;
+			em->len = key.offset - start;
+			break;
+		}
+
+		fi = btrfs_item_ptr(path.nodes[0], path.slots[0],
+				    struct btrfs_file_extent_item);
+		extent_end = btrfs_file_extent_end(&path);
+
+		/*
+		 *	|<- file extent ->|	|
+		 *				\- start
+		 *
+		 * We haven't reached start, search next slot.
+		 */
+		if (extent_end <= start)
+			goto next;
+
+		/* Now this extent covers @start, convert it to em */
+		btrfs_extent_item_to_extent_map(inode, &path, fi, false, em);
+		break;
+next:
+		ret = btrfs_next_item(root, &path);
+		if (ret < 0)
+			goto err;
+		if (ret > 0)
+			goto not_found;
+	}
+	btrfs_release_path(&path);
+	return em;
+
+not_found:
+	btrfs_release_path(&path);
+	free_extent_map(em);
+	return NULL;
+
+err:
+	btrfs_release_path(&path);
+	free_extent_map(em);
+	return ERR_PTR(ret);
+}
+
 static struct extent_map *defrag_lookup_extent(struct inode *inode, u64 start,
-					       bool locked)
+					       u64 newer_than, bool locked)
 {
 	struct extent_map_tree *em_tree = &BTRFS_I(inode)->extent_tree;
 	struct extent_io_tree *io_tree = &BTRFS_I(inode)->io_tree;
@@ -1040,7 +1186,7 @@ static struct extent_map *defrag_lookup_extent(struct inode *inode, u64 start,
 		/* get the big lock and read metadata off disk */
 		if (!locked)
 			lock_extent_bits(io_tree, start, end, &cached);
-		em = btrfs_get_extent(BTRFS_I(inode), NULL, 0, start, sectorsize);
+		em = defrag_get_extent(BTRFS_I(inode), start, newer_than);
 		if (!locked)
 			unlock_extent_cached(io_tree, start, end, &cached);
 
@@ -1068,7 +1214,12 @@ static bool defrag_check_next_extent(struct inode *inode, struct extent_map *em,
 	if (em->start + em->len >= i_size_read(inode))
 		return ret;
 
-	next = defrag_lookup_extent(inode, em->start + em->len, locked);
+	/*
+	 * We want to check if the next extent can be merged with the current
+	 * one, which can be an extent created in a past generation, so we pass
+	 * a minimum generation of 0 to defrag_lookup_extent().
+	 */
+	next = defrag_lookup_extent(inode, em->start + em->len, 0, locked);
 	/* No more em or hole */
 	if (!next || next->block_start >= EXTENT_MAP_LAST_BYTE)
 		goto out;
@@ -1216,7 +1367,8 @@ static int defrag_collect_targets(struct btrfs_inode *inode,
 		u64 range_len;
 
 		last_is_target = false;
-		em = defrag_lookup_extent(&inode->vfs_inode, cur, locked);
+		em = defrag_lookup_extent(&inode->vfs_inode, cur,
+					  ctrl->newer_than, locked);
 		if (!em)
 			break;
 
-- 
2.35.0



* [PATCH v3 2/2] btrfs: defrag: don't use merged extent map for their generation check
  2022-02-11  6:46 [PATCH v3 0/2] btrfs: defrag: bring back the old file extent search behavior and address merged extent map generation problem Qu Wenruo
  2022-02-11  6:46 ` [PATCH v3 1/2] btrfs: defrag: bring back the old file extent search behavior Qu Wenruo
@ 2022-02-11  6:46 ` Qu Wenruo
  2022-02-11 12:07 ` [PATCH v3 0/2] btrfs: defrag: bring back the old file extent search behavior and address merged extent map generation problem Filipe Manana
  2022-02-21 14:41 ` David Sterba
  3 siblings, 0 replies; 9+ messages in thread
From: Qu Wenruo @ 2022-02-11  6:46 UTC (permalink / raw)
  To: linux-btrfs

For extent maps, if they are not compressed extents and are adjacent by
logical addresses and file offsets, they can be merged into one larger
extent map.

Such a merged extent map will have the highest generation of all the
original ones.

But this brings a problem for autodefrag, as it relies on accurate
extent_map::generation to determine if one extent should be defragged.

For merged extent maps, their higher generation can mark some older
extents to be defragged while the original extent map doesn't meet the
minimal generation threshold.

Thus this will cause extra IO.
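
The effect can be sketched with a small user-space model. This is only an
illustration under simplifying assumptions: struct toy_em and the toy_*()
helpers are hypothetical, carry just the fields needed here, and ignore
compression and the other conditions the real merge code checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, trimmed-down extent map; only the fields needed here. */
struct toy_em {
	uint64_t start;		/* file offset */
	uint64_t len;
	uint64_t block_start;	/* logical address */
	uint64_t generation;
};

/* Adjacent both by file offset and by logical address. */
static bool toy_mergeable(const struct toy_em *prev, const struct toy_em *next)
{
	return prev->start + prev->len == next->start &&
	       prev->block_start + prev->len == next->block_start;
}

/* Merge @next into @prev, keeping the higher generation of the two. */
static void toy_merge(struct toy_em *prev, const struct toy_em *next)
{
	assert(toy_mergeable(prev, next));
	prev->len += next->len;
	if (next->generation > prev->generation)
		prev->generation = next->generation;
}

/* The generation check applied to a single extent map. */
static bool toy_is_defrag_candidate(const struct toy_em *em, uint64_t newer_than)
{
	return em->generation >= newer_than;
}
```

With a 1MiB extent from generation 5 followed by an adjacent 4KiB extent
from generation 9, and a threshold of 8, only the 4KiB extent passes the
check before merging. After merging, the single 1MiB + 4KiB extent map
carries generation 9 and passes as a whole.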

To solve the problem, here we introduce a new flag, EXTENT_FLAG_MERGED, to
indicate if the extent map is merged from two or more ems.

And for autodefrag, if we find a merged extent map, and its generation
meets the generation requirement, we just don't use this one, and go
back to defrag_get_extent() to read extent maps from subvolume trees.

This could cause more read IO, but should result in fewer defrag data writes,
so in the long run it should be a win for autodefrag.
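
As a rough summary of when a cached extent map is still used, the rule can
be written as a single predicate. This is a user-space sketch only:
struct toy_em, TOY_FLAG_MERGED and toy_cached_em_usable() are hypothetical
stand-ins, and a NULL pointer models "nothing cached".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_FLAG_MERGED	(1UL << 0)	/* hypothetical flag bit */

/* Hypothetical, trimmed-down extent map. */
struct toy_em {
	uint64_t generation;
	unsigned long flags;
};

/*
 * Return true if the cached @em can be used as is, false if the caller
 * should fall back to reading the file extent from the subvolume tree.
 */
static bool toy_cached_em_usable(const struct toy_em *em, uint64_t newer_than)
{
	if (!em)
		return false;			/* nothing cached */
	if (!(em->flags & TOY_FLAG_MERGED))
		return true;			/* generation is reliable */
	if (newer_than == 0)
		return true;			/* generation is not checked */
	/* A merged em that is too old would be rejected anyway. */
	return em->generation < newer_than;
}

/* Walk through the cases described above. */
void toy_self_check(void)
{
	const struct toy_em plain = { .generation = 9, .flags = 0 };
	const struct toy_em merged_new = { .generation = 9, .flags = TOY_FLAG_MERGED };
	const struct toy_em merged_old = { .generation = 5, .flags = TOY_FLAG_MERGED };

	assert(!toy_cached_em_usable(NULL, 8));
	assert(toy_cached_em_usable(&plain, 8));
	assert(!toy_cached_em_usable(&merged_new, 8));
	assert(toy_cached_em_usable(&merged_old, 8));
	assert(toy_cached_em_usable(&merged_new, 0));
}
```

Only the merged extent map whose generation passes the threshold forces a
re-read; every other cached extent map is used without extra read IO.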

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/extent_map.c |  2 ++
 fs/btrfs/extent_map.h |  8 ++++++++
 fs/btrfs/ioctl.c      | 14 ++++++++++++++
 3 files changed, 24 insertions(+)

diff --git a/fs/btrfs/extent_map.c b/fs/btrfs/extent_map.c
index ba43303cb081..6fee14ce2e6b 100644
--- a/fs/btrfs/extent_map.c
+++ b/fs/btrfs/extent_map.c
@@ -261,6 +261,7 @@ static void try_merge_map(struct extent_map_tree *tree, struct extent_map *em)
 			em->mod_len = (em->mod_len + em->mod_start) - merge->mod_start;
 			em->mod_start = merge->mod_start;
 			em->generation = max(em->generation, merge->generation);
+			set_bit(EXTENT_FLAG_MERGED, &em->flags);
 
 			rb_erase_cached(&merge->rb_node, &tree->map);
 			RB_CLEAR_NODE(&merge->rb_node);
@@ -278,6 +279,7 @@ static void try_merge_map(struct extent_map_tree *tree, struct extent_map *em)
 		RB_CLEAR_NODE(&merge->rb_node);
 		em->mod_len = (merge->mod_start + merge->mod_len) - em->mod_start;
 		em->generation = max(em->generation, merge->generation);
+		set_bit(EXTENT_FLAG_MERGED, &em->flags);
 		free_extent_map(merge);
 	}
 }
diff --git a/fs/btrfs/extent_map.h b/fs/btrfs/extent_map.h
index 8e217337dff9..d2fa32ffe304 100644
--- a/fs/btrfs/extent_map.h
+++ b/fs/btrfs/extent_map.h
@@ -25,6 +25,8 @@ enum {
 	EXTENT_FLAG_FILLING,
 	/* filesystem extent mapping type */
 	EXTENT_FLAG_FS_MAPPING,
+	/* This em is merged from two or more physically adjacent ems */
+	EXTENT_FLAG_MERGED,
 };
 
 struct extent_map {
@@ -40,6 +42,12 @@ struct extent_map {
 	u64 ram_bytes;
 	u64 block_start;
 	u64 block_len;
+
+	/*
+	 * Generation of the extent map, for merged em it's the highest
+	 * generation of all merged ems.
+	 * For non-merged extents, it's from btrfs_file_extent_item::generation.
+	 */
 	u64 generation;
 	unsigned long flags;
 	/* Used for chunk mappings, flag EXTENT_FLAG_FS_MAPPING must be set */
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 0e983b782968..c04175ad1b07 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -1179,6 +1179,20 @@ static struct extent_map *defrag_lookup_extent(struct inode *inode, u64 start,
 	em = lookup_extent_mapping(em_tree, start, sectorsize);
 	read_unlock(&em_tree->lock);
 
+	/*
+	 * We can get a merged extent, in that case, we need to re-search
+	 * tree to get the original em for defrag.
+	 *
+	 * If @newer_than is 0 or em::generation < newer_than, we can trust
+	 * this em, as either we don't care about the generation, or the
+	 * merged extent map will be rejected anyway.
+	 */
+	if (em && test_bit(EXTENT_FLAG_MERGED, &em->flags) &&
+	    newer_than && em->generation >= newer_than) {
+		free_extent_map(em);
+		em = NULL;
+	}
+
 	if (!em) {
 		struct extent_state *cached = NULL;
 		u64 end = start + sectorsize - 1;
-- 
2.35.0



* Re: [PATCH v3 0/2] btrfs: defrag: bring back the old file extent search behavior and address merged extent map generation problem
  2022-02-11  6:46 [PATCH v3 0/2] btrfs: defrag: bring back the old file extent search behavior and address merged extent map generation problem Qu Wenruo
  2022-02-11  6:46 ` [PATCH v3 1/2] btrfs: defrag: bring back the old file extent search behavior Qu Wenruo
  2022-02-11  6:46 ` [PATCH v3 2/2] btrfs: defrag: don't use merged extent map for their generation check Qu Wenruo
@ 2022-02-11 12:07 ` Filipe Manana
  2022-02-21 14:41 ` David Sterba
  3 siblings, 0 replies; 9+ messages in thread
From: Filipe Manana @ 2022-02-11 12:07 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

On Fri, Feb 11, 2022 at 02:46:11PM +0800, Qu Wenruo wrote:
> Filipe reported that the old defrag code using btrfs_search_forward() to
> do the following optimization:
> 
> - Don't cache extent maps
>   To save memory in the long run
> 
> - Skip entire file ranges which doesn't meet generation requirement
> 
> - Don't use merged extent maps which will have unreliable geneartion
> 
> The first patch will bring back the old behavior, along with the old
> optimizations.
> 
> However the 3rd problem is not that easy to solve, as data
> read/readahead can also load extent maps into the cache, and causing
> extent maps being merged.
> 
> Such already cached and merged extent maps will still confuse autodefrag,
> as if we found cached extent maps, we will not try to read them from
> disk again.
> 
> So to completely prevent merged extent maps tricking autodefrag, here
> comes the 2nd patch, to mark merged extent maps for defrag.
> 
> If we hit an merged extent, and its generation meets our requirement, we
> will not trust it but read from disk to get a reliable generation.
> 
> This should reduce defrag IO caused by the hidden extent map merging
> behavior.
> 
> Changelog:
> v2:
> - Make defrag_get_em() to be more flexiable to handle file extent
>   iteartion
>   Now it will not reject item key which is smaller than our target but
>   doesn't have the wanted type/objectid.
>   It will continue go next next instead, to prevent skipping an extent.
> 
> - Properly reduce path.slots[0]
>   There is a bug where I want to put "if (path.slots[0] == 0)" but I put
>   "if (btrfs_header_nritems(path.slots[0]))".
>   This is fixed with reworked file extent iteration code.
> 
> - Address merged extent maps properly
>   With fixed defrag_get_extent(), we can rely on it to get original em
>   from disk.
>   So what we need to do is just to ignore merged extents which meets
>   our generation requirement.
> 
> v3:
> - Rebased to latest misc-next
> 
> - Fix several generation spell typo
> 
> - Fix a case where btrfs_search_slot() can lead to path->slots[0] >=
>   nritems
> 
> - Fix the commit message on modified extent map
>   Now that part mentioning fsync() doesn't help on the autodefrag bug.
> 
> - Update the wording on extent map read from subvolume trees
> 
> Qu Wenruo (2):
>   btrfs: defrag: bring back the old file extent search behavior
>   btrfs: defrag: don't use merged extent map for their generation check

Ok, for both patches:

Reviewed-by: Filipe Manana <fdmanana@suse.com>

Thanks.

> 
>  fs/btrfs/extent_map.c |   2 +
>  fs/btrfs/extent_map.h |   8 ++
>  fs/btrfs/ioctl.c      | 174 +++++++++++++++++++++++++++++++++++++++++-
>  3 files changed, 180 insertions(+), 4 deletions(-)
> 
> -- 
> 2.35.0
> 


* Re: [PATCH v3 1/2] btrfs: defrag: bring back the old file extent search behavior
  2022-02-11  6:46 ` [PATCH v3 1/2] btrfs: defrag: bring back the old file extent search behavior Qu Wenruo
@ 2022-02-14 16:15   ` David Sterba
  2022-02-15  0:02     ` Qu Wenruo
  0 siblings, 1 reply; 9+ messages in thread
From: David Sterba @ 2022-02-14 16:15 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs, Filipe Manana

On Fri, Feb 11, 2022 at 02:46:12PM +0800, Qu Wenruo wrote:
> @@ -1216,7 +1367,8 @@ static int defrag_collect_targets(struct btrfs_inode *inode,
>  		u64 range_len;
>  
>  		last_is_target = false;
> -		em = defrag_lookup_extent(&inode->vfs_inode, cur, locked);
> +		em = defrag_lookup_extent(&inode->vfs_inode, cur,
> +					  ctrl->newer_than, locked);

This uses the ctrl structure, if this is also supposed to go to 5.16
please provide a version that applies, thanks.


* Re: [PATCH v3 1/2] btrfs: defrag: bring back the old file extent search behavior
  2022-02-14 16:15   ` David Sterba
@ 2022-02-15  0:02     ` Qu Wenruo
  2022-02-21 17:22       ` David Sterba
  0 siblings, 1 reply; 9+ messages in thread
From: Qu Wenruo @ 2022-02-15  0:02 UTC (permalink / raw)
  To: dsterba, linux-btrfs, Filipe Manana



On 2022/2/15 00:15, David Sterba wrote:
> On Fri, Feb 11, 2022 at 02:46:12PM +0800, Qu Wenruo wrote:
>> @@ -1216,7 +1367,8 @@ static int defrag_collect_targets(struct btrfs_inode *inode,
>>   		u64 range_len;
>>   
>>   		last_is_target = false;
>> -		em = defrag_lookup_extent(&inode->vfs_inode, cur, locked);
>> +		em = defrag_lookup_extent(&inode->vfs_inode, cur,
>> +					  ctrl->newer_than, locked);
> 
> This uses the ctrl structure, if this is also supposed to go to 5.16
> please provide a version that applies, thanks.
> 

The conflicts are already small enough for this patchset and later
autodefrag work.

I can easily do a manual backport for v5.16.

Thanks,
Qu



* Re: [PATCH v3 0/2] btrfs: defrag: bring back the old file extent search behavior and address merged extent map generation problem
  2022-02-11  6:46 [PATCH v3 0/2] btrfs: defrag: bring back the old file extent search behavior and address merged extent map generation problem Qu Wenruo
                   ` (2 preceding siblings ...)
  2022-02-11 12:07 ` [PATCH v3 0/2] btrfs: defrag: bring back the old file extent search behavior and address merged extent map generation problem Filipe Manana
@ 2022-02-21 14:41 ` David Sterba
  3 siblings, 0 replies; 9+ messages in thread
From: David Sterba @ 2022-02-21 14:41 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

On Fri, Feb 11, 2022 at 02:46:11PM +0800, Qu Wenruo wrote:
> Filipe reported that the old defrag code using btrfs_search_forward() to
> do the following optimization:
> 
> - Don't cache extent maps
>   To save memory in the long run
> 
> - Skip entire file ranges which doesn't meet generation requirement
> 
> - Don't use merged extent maps which will have unreliable geneartion
> 
> The first patch will bring back the old behavior, along with the old
> optimizations.
> 
> However the 3rd problem is not that easy to solve, as data
> read/readahead can also load extent maps into the cache, and causing
> extent maps being merged.
> 
> Such already cached and merged extent maps will still confuse autodefrag,
> as if we found cached extent maps, we will not try to read them from
> disk again.
> 
> So to completely prevent merged extent maps tricking autodefrag, here
> comes the 2nd patch, to mark merged extent maps for defrag.
> 
> If we hit an merged extent, and its generation meets our requirement, we
> will not trust it but read from disk to get a reliable generation.
> 
> This should reduce defrag IO caused by the hidden extent map merging
> behavior.
> 
> Changelog:
> v2:
> - Make defrag_get_em() to be more flexiable to handle file extent
>   iteartion
>   Now it will not reject item key which is smaller than our target but
>   doesn't have the wanted type/objectid.
>   It will continue go next next instead, to prevent skipping an extent.
> 
> - Properly reduce path.slots[0]
>   There is a bug where I want to put "if (path.slots[0] == 0)" but I put
>   "if (btrfs_header_nritems(path.slots[0]))".
>   This is fixed with reworked file extent iteration code.
> 
> - Address merged extent maps properly
>   With fixed defrag_get_extent(), we can rely on it to get original em
>   from disk.
>   So what we need to do is just to ignore merged extents which meets
>   our generation requirement.
> 
> v3:
> - Rebased to latest misc-next
> 
> - Fix several generation spell typo
> 
> - Fix a case where btrfs_search_slot() can lead to path->slots[0] >=
>   nritems
> 
> - Fix the commit message on modified extent map
>   Now that part mentioning fsync() doesn't help on the autodefrag bug.
> 
> - Update the wording on extent map read from subvolume trees
> 
> Qu Wenruo (2):
>   btrfs: defrag: bring back the old file extent search behavior
>   btrfs: defrag: don't use merged extent map for their generation check

Added to misc-next, thanks.


* Re: [PATCH v3 1/2] btrfs: defrag: bring back the old file extent search behavior
  2022-02-15  0:02     ` Qu Wenruo
@ 2022-02-21 17:22       ` David Sterba
  2022-02-22  0:05         ` Qu Wenruo
  0 siblings, 1 reply; 9+ messages in thread
From: David Sterba @ 2022-02-21 17:22 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: dsterba, linux-btrfs, Filipe Manana

On Tue, Feb 15, 2022 at 08:02:35AM +0800, Qu Wenruo wrote:
> 
> 
> On 2022/2/15 00:15, David Sterba wrote:
> > On Fri, Feb 11, 2022 at 02:46:12PM +0800, Qu Wenruo wrote:
> >> @@ -1216,7 +1367,8 @@ static int defrag_collect_targets(struct btrfs_inode *inode,
> >>   		u64 range_len;
> >>   
> >>   		last_is_target = false;
> >> -		em = defrag_lookup_extent(&inode->vfs_inode, cur, locked);
> >> +		em = defrag_lookup_extent(&inode->vfs_inode, cur,
> >> +					  ctrl->newer_than, locked);
> > 
> > This uses the ctrl structure, if this is also supposed to go to 5.16
> > please provide a version that applies, thanks.
> > 
> 
> The conflicts are already smaller enough for this patchset and later 
> autodefrag work.
> 
> I can easily do a manual backport for v5.16.

So the change is to use newer_than instead of ctrl->newer_than, right?
I'll use that so it does not depend on the ctrl patches.


* Re: [PATCH v3 1/2] btrfs: defrag: bring back the old file extent search behavior
  2022-02-21 17:22       ` David Sterba
@ 2022-02-22  0:05         ` Qu Wenruo
  0 siblings, 0 replies; 9+ messages in thread
From: Qu Wenruo @ 2022-02-22  0:05 UTC (permalink / raw)
  To: dsterba, Qu Wenruo, linux-btrfs, Filipe Manana



On 2022/2/22 01:22, David Sterba wrote:
> On Tue, Feb 15, 2022 at 08:02:35AM +0800, Qu Wenruo wrote:
>>
>>
>> On 2022/2/15 00:15, David Sterba wrote:
>>> On Fri, Feb 11, 2022 at 02:46:12PM +0800, Qu Wenruo wrote:
>>>> @@ -1216,7 +1367,8 @@ static int defrag_collect_targets(struct btrfs_inode *inode,
>>>>    		u64 range_len;
>>>>
>>>>    		last_is_target = false;
>>>> -		em = defrag_lookup_extent(&inode->vfs_inode, cur, locked);
>>>> +		em = defrag_lookup_extent(&inode->vfs_inode, cur,
>>>> +					  ctrl->newer_than, locked);
>>>
>>> This uses the ctrl structure, if this is also supposed to go to 5.16
>>> please provide a version that applies, thanks.
>>>
>>
>> The conflicts are already smaller enough for this patchset and later
>> autodefrag work.
>>
>> I can easily do a manual backport for v5.16.
>
> So the change is to use newer_than instead of ctrl->newer_then, right?
> I'll use that so it does not depend on the ctrl patches.

Yes. But please keep in mind that there are some special hacks in the
original code, which uses btrfs_ioctl_defrag_range_args's @start as
btrfs_defrag_ctrl::last_scanned.

Which sometimes can be confusing for autodefrag fixes.

Thanks,
Qu
