* [PATCH v3 1/3] btrfs: defrag: don't try to merge regular extents with preallocated extents
@ 2022-01-26  0:58 Qu Wenruo
  2022-01-26  0:58 ` [PATCH 2/3] btrfs: defrag: use extent_thresh to replace the hardcoded size limit Qu Wenruo
                   ` (3 more replies)
  0 siblings, 4 replies; 16+ messages in thread
From: Qu Wenruo @ 2022-01-26  0:58 UTC (permalink / raw)
  To: linux-btrfs

[BUG]
On older kernels (before v5.16), btrfs will defrag preallocated extents.
On newer kernels (v5.16 and later), btrfs will not defrag preallocated
extents, but it will still defrag the extent just before a preallocated
extent, even if it is only a single sector.

This can be exposed by the following small script:

	mkfs.btrfs -f $dev > /dev/null

	mount $dev $mnt
	xfs_io -f -c "pwrite 0 4k" -c sync -c "falloc 4k 16K" $mnt/file
	xfs_io -c "fiemap -v" $mnt/file
	btrfs fi defrag $mnt/file
	sync
	xfs_io -c "fiemap -v" $mnt/file

The output looks like this on older kernels:

/mnt/btrfs/file:
 EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
   0: [0..7]:          26624..26631         8   0x0
   1: [8..39]:         26632..26663        32 0x801
/mnt/btrfs/file:
 EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
   0: [0..39]:         26664..26703        40   0x1

The old behavior defrags the single sector along with the preallocated
extent, replacing both with a regular extent at a new location (caused
by data COW).
This wastes most of the data IO just on the preallocated range.

On the other hand, v5.16 is slightly better:

/mnt/btrfs/file:
 EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
   0: [0..7]:          26624..26631         8   0x0
   1: [8..39]:         26632..26663        32 0x801
/mnt/btrfs/file:
 EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
   0: [0..7]:          26664..26671         8   0x0
   1: [8..39]:         26632..26663        32 0x801

The preallocated range is not defragged, but the sector before it still
gets defragged, for no benefit.

[CAUSE]
One function shared by the old and new code paths is
defrag_check_next_extent(), which determines whether we should defrag
the current extent by checking the next one.

It only checks whether the next extent is a hole or an inline extent,
but it does not check whether it is preallocated.

Outside this function, however, both old and new kernels reject
preallocated extents.

This inconsistency causes the behavior described above.

[FIX]
- Also check whether the next extent is preallocated.
  If so, don't defrag the current extent.

- Add a comment to each branch explaining why the extent is rejected.

This reduces the IO caused by the defrag ioctl and autodefrag.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
Changelog:
v2:
- Use @extent_thresh from the caller to replace the hardcoded threshold
  Now the caller has full control over the extent threshold value.

- Remove the old ambiguous check based on physical address
  The original check was too specific, only rejecting extents which are
  physically adjacent AND too large.
  Since we now have a correct size check, and the physically adjacent
  check is not always a win, remove the old check completely.

v3:
- Split the @extent_thresh and physically adjacent checks into separate
  patches

- Simplify the comment
---
 fs/btrfs/ioctl.c | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 91ba2efe9792..0d8bfc716e6b 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -1053,19 +1053,25 @@ static bool defrag_check_next_extent(struct inode *inode, struct extent_map *em,
 				     bool locked)
 {
 	struct extent_map *next;
-	bool ret = true;
+	bool ret = false;
 
 	/* this is the last extent */
 	if (em->start + em->len >= i_size_read(inode))
-		return false;
+		return ret;
 
 	next = defrag_lookup_extent(inode, em->start + em->len, locked);
+	/* No more em or hole */
 	if (!next || next->block_start >= EXTENT_MAP_LAST_BYTE)
-		ret = false;
-	else if ((em->block_start + em->block_len == next->block_start) &&
-		 (em->block_len > SZ_128K && next->block_len > SZ_128K))
-		ret = false;
-
+		goto out;
+	/* Preallocated */
+	if (test_bit(EXTENT_FLAG_PREALLOC, &em->flags))
+		goto out;
+	/* Physically adjacent and large enough */
+	if ((em->block_start + em->block_len == next->block_start) &&
+	    (em->block_len > SZ_128K && next->block_len > SZ_128K))
+		goto out;
+	ret = true;
+out:
 	free_extent_map(next);
 	return ret;
 }
-- 
2.34.1



* [PATCH 2/3] btrfs: defrag: use extent_thresh to replace the hardcoded size limit
  2022-01-26  0:58 [PATCH v3 1/3] btrfs: defrag: don't try to merge regular extents with preallocated extents Qu Wenruo
@ 2022-01-26  0:58 ` Qu Wenruo
  2022-01-26 11:40   ` Filipe Manana
  2022-01-26  0:58 ` [PATCH 3/3] btrfs: defrag: remove the physical adjacent extents rejection in defrag_check_next_extent() Qu Wenruo
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 16+ messages in thread
From: Qu Wenruo @ 2022-01-26  0:58 UTC (permalink / raw)
  To: linux-btrfs

In defrag_check_next_extent() we use a hardcoded extent size threshold,
SZ_128K, instead of @extent_thresh from btrfs_defrag_file().

This can lead to inconsistent behavior, especially since the default
extent size threshold is 256K.

Fix this by passing @extent_thresh into defrag_check_next_extent() and
using that value.

Also, since the extent_thresh check should apply to all extents, not
only physically adjacent ones, move the threshold check into a
dedicated if ().

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/ioctl.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 0d8bfc716e6b..2911df12fc48 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -1050,7 +1050,7 @@ static struct extent_map *defrag_lookup_extent(struct inode *inode, u64 start,
 }
 
 static bool defrag_check_next_extent(struct inode *inode, struct extent_map *em,
-				     bool locked)
+				     u32 extent_thresh, bool locked)
 {
 	struct extent_map *next;
 	bool ret = false;
@@ -1066,9 +1066,11 @@ static bool defrag_check_next_extent(struct inode *inode, struct extent_map *em,
 	/* Preallocated */
 	if (test_bit(EXTENT_FLAG_PREALLOC, &em->flags))
 		goto out;
-	/* Physically adjacent and large enough */
-	if ((em->block_start + em->block_len == next->block_start) &&
-	    (em->block_len > SZ_128K && next->block_len > SZ_128K))
+	/* Extent is already large enough */
+	if (next->len >= extent_thresh)
+		goto out;
+	/* Physically adjacent */
+	if ((em->block_start + em->block_len == next->block_start))
 		goto out;
 	ret = true;
 out:
@@ -1231,7 +1233,7 @@ static int defrag_collect_targets(struct btrfs_inode *inode,
 			goto next;
 
 		next_mergeable = defrag_check_next_extent(&inode->vfs_inode, em,
-							  locked);
+							  extent_thresh, locked);
 		if (!next_mergeable) {
 			struct defrag_target_range *last;
 
-- 
2.34.1



* [PATCH 3/3] btrfs: defrag: remove the physical adjacent extents rejection in defrag_check_next_extent()
  2022-01-26  0:58 [PATCH v3 1/3] btrfs: defrag: don't try to merge regular extents with preallocated extents Qu Wenruo
  2022-01-26  0:58 ` [PATCH 2/3] btrfs: defrag: use extent_thresh to replace the hardcoded size limit Qu Wenruo
@ 2022-01-26  0:58 ` Qu Wenruo
  2022-01-26 11:44   ` Filipe Manana
  2022-01-26 11:26 ` [PATCH v3 1/3] btrfs: defrag: don't try to merge regular extents with preallocated extents Filipe Manana
  2022-01-28  6:31 ` Qu Wenruo
  3 siblings, 1 reply; 16+ messages in thread
From: Qu Wenruo @ 2022-01-26  0:58 UTC (permalink / raw)
  To: linux-btrfs

defrag_check_next_extent() has rejected physically adjacent extents
from the very beginning.

The check rejects physically adjacent extents which are also large
enough.

The extent size threshold check is now a generic check, and the
benefit of rejecting physically adjacent extents is questionable.

Admittedly, physically adjacent extents mean no extra seek time, so
defragging them may not bring much help.

On the other hand, btrfs also benefits from a reduced number of extents
(which can reduce the size of the extent tree, and thus the mount time).

So such rejection is not a clear win.

Remove the check; the policy on which extents should be defragged is
now mostly determined by the @extent_thresh and @newer_than parameters.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/ioctl.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 2911df12fc48..0929d08bb378 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -1069,9 +1069,6 @@ static bool defrag_check_next_extent(struct inode *inode, struct extent_map *em,
 	/* Extent is already large enough */
 	if (next->len >= extent_thresh)
 		goto out;
-	/* Physically adjacent */
-	if ((em->block_start + em->block_len == next->block_start))
-		goto out;
 	ret = true;
 out:
 	free_extent_map(next);
-- 
2.34.1



* Re: [PATCH v3 1/3] btrfs: defrag: don't try to merge regular extents with preallocated extents
  2022-01-26  0:58 [PATCH v3 1/3] btrfs: defrag: don't try to merge regular extents with preallocated extents Qu Wenruo
  2022-01-26  0:58 ` [PATCH 2/3] btrfs: defrag: use extent_thresh to replace the hardcoded size limit Qu Wenruo
  2022-01-26  0:58 ` [PATCH 3/3] btrfs: defrag: remove the physical adjacent extents rejection in defrag_check_next_extent() Qu Wenruo
@ 2022-01-26 11:26 ` Filipe Manana
  2022-01-26 11:33   ` Qu Wenruo
  2022-01-28  6:31 ` Qu Wenruo
  3 siblings, 1 reply; 16+ messages in thread
From: Filipe Manana @ 2022-01-26 11:26 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

On Wed, Jan 26, 2022 at 08:58:48AM +0800, Qu Wenruo wrote:
> [BUG]
> With older kernels (before v5.16), btrfs will defrag preallocated extents.
> While with newer kernels (v5.16 and newer) btrfs will not defrag
> preallocated extents, but it will defrag the extent just before the
> preallocated extent, even it's just a single sector.

In that case, isn't a Fixes: tag missing?

> 
> This can be exposed by the following small script:
> 
> 	mkfs.btrfs -f $dev > /dev/null
> 
> 	mount $dev $mnt
> 	xfs_io -f -c "pwrite 0 4k" -c sync -c "falloc 4k 16K" $mnt/file
> 	xfs_io -c "fiemap -v" $mnt/file
> 	btrfs fi defrag $mnt/file
> 	sync
> 	xfs_io -c "fiemap -v" $mnt/file
> 
> The output looks like this on older kernels:
> 
> /mnt/btrfs/file:
>  EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
>    0: [0..7]:          26624..26631         8   0x0
>    1: [8..39]:         26632..26663        32 0x801
> /mnt/btrfs/file:
>  EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
>    0: [0..39]:         26664..26703        40   0x1
> 
> Which defrags the single sector along with the preallocated extent, and
> replace them with an regular extent into a new location (caused by data
> COW).
> This wastes most of the data IO just for the preallocated range.
> 
> On the other hand, v5.16 is slightly better:
> 
> /mnt/btrfs/file:
>  EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
>    0: [0..7]:          26624..26631         8   0x0
>    1: [8..39]:         26632..26663        32 0x801
> /mnt/btrfs/file:
>  EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
>    0: [0..7]:          26664..26671         8   0x0
>    1: [8..39]:         26632..26663        32 0x801
> 
> The preallocated range is not defragged, but the sector before it still
> gets defragged, which has no need for it.
> 
> [CAUSE]
> One of the function reused by the old and new behavior is
> defrag_check_next_extent(), it will determine if we should defrag
> current extent by checking the next one.
> 
> It only checks if the next extent is a hole or inlined, but it doesn't
> check if it's preallocated.
> 
> On the other hand, out of the function, both old and new kernel will
> reject preallocated extents.
> 
> Such inconsistent behavior causes above behavior.
> 
> [FIX]
> - Also check if next extent is preallocated
>   If so, don't defrag current extent.
> 
> - Add comments for each branch why we reject the extent
> 
> This will reduce the IO caused by defrag ioctl and autodefrag.
> 
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
> Changelog:
> v2:
> - Use @extent_thresh from caller to replace the harded coded threshold
>   Now caller has full control over the extent threshold value.
> 
> - Remove the old ambiguous check based on physical address
>   The original check is too specific, only reject extents which are
>   physically adjacent, AND too large.
>   Since we have correct size check now, and the physically adjacent check
>   is not always a win.
>   So remove the old check completely.
> 
> v3:
> - Split the @extent_thresh and physicall adjacent check into other
>   patches
> 
> - Simplify the comment
> ---
>  fs/btrfs/ioctl.c | 20 +++++++++++++-------
>  1 file changed, 13 insertions(+), 7 deletions(-)
> 
> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> index 91ba2efe9792..0d8bfc716e6b 100644
> --- a/fs/btrfs/ioctl.c
> +++ b/fs/btrfs/ioctl.c
> @@ -1053,19 +1053,25 @@ static bool defrag_check_next_extent(struct inode *inode, struct extent_map *em,
>  				     bool locked)
>  {
>  	struct extent_map *next;
> -	bool ret = true;
> +	bool ret = false;
>  
>  	/* this is the last extent */
>  	if (em->start + em->len >= i_size_read(inode))
> -		return false;
> +		return ret;
>  
>  	next = defrag_lookup_extent(inode, em->start + em->len, locked);
> +	/* No more em or hole */
>  	if (!next || next->block_start >= EXTENT_MAP_LAST_BYTE)
> -		ret = false;
> -	else if ((em->block_start + em->block_len == next->block_start) &&
> -		 (em->block_len > SZ_128K && next->block_len > SZ_128K))
> -		ret = false;
> -
> +		goto out;
> +	/* Preallocated */
> +	if (test_bit(EXTENT_FLAG_PREALLOC, &em->flags))

The comment is superfluous, the name of the flag is pretty informative that
we are checking for a preallocated extent. You don't need to send a new
version just to remove the comment however.

For the Fixes: tag, you can just comment and David will pick it.

Reviewed-by: Filipe Manana <fdmanana@suse.com>

Thanks.

> +		goto out;
> +	/* Physically adjacent and large enough */
> +	if ((em->block_start + em->block_len == next->block_start) &&
> +	    (em->block_len > SZ_128K && next->block_len > SZ_128K))
> +		goto out;
> +	ret = true;
> +out:
>  	free_extent_map(next);
>  	return ret;
>  }
> -- 
> 2.34.1
> 


* Re: [PATCH v3 1/3] btrfs: defrag: don't try to merge regular extents with preallocated extents
  2022-01-26 11:26 ` [PATCH v3 1/3] btrfs: defrag: don't try to merge regular extents with preallocated extents Filipe Manana
@ 2022-01-26 11:33   ` Qu Wenruo
  2022-01-26 11:47     ` Filipe Manana
  0 siblings, 1 reply; 16+ messages in thread
From: Qu Wenruo @ 2022-01-26 11:33 UTC (permalink / raw)
  To: Filipe Manana, Qu Wenruo; +Cc: linux-btrfs



On 2022/1/26 19:26, Filipe Manana wrote:
> On Wed, Jan 26, 2022 at 08:58:48AM +0800, Qu Wenruo wrote:
>> [BUG]
>> With older kernels (before v5.16), btrfs will defrag preallocated extents.
>> While with newer kernels (v5.16 and newer) btrfs will not defrag
>> preallocated extents, but it will defrag the extent just before the
>> preallocated extent, even it's just a single sector.
>
> In that case, isn't a Fixes: tag missing?

The function defrag_check_next_extent() is reused from older kernel
code, dating back to 2012.
(And even that 2012 commit is not really the root cause.)
Thus, unfortunately, it has been missing the preallocated check from
the very beginning.

Does it still make sense to reference such an old commit?

It would make more sense to CC this to the v5.x stable branches, though.

Any advice on this situation?

Thanks,
Qu
>
>>
>> This can be exposed by the following small script:
>>
>> 	mkfs.btrfs -f $dev > /dev/null
>>
>> 	mount $dev $mnt
>> 	xfs_io -f -c "pwrite 0 4k" -c sync -c "falloc 4k 16K" $mnt/file
>> 	xfs_io -c "fiemap -v" $mnt/file
>> 	btrfs fi defrag $mnt/file
>> 	sync
>> 	xfs_io -c "fiemap -v" $mnt/file
>>
>> The output looks like this on older kernels:
>>
>> /mnt/btrfs/file:
>>   EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
>>     0: [0..7]:          26624..26631         8   0x0
>>     1: [8..39]:         26632..26663        32 0x801
>> /mnt/btrfs/file:
>>   EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
>>     0: [0..39]:         26664..26703        40   0x1
>>
>> Which defrags the single sector along with the preallocated extent, and
>> replace them with an regular extent into a new location (caused by data
>> COW).
>> This wastes most of the data IO just for the preallocated range.
>>
>> On the other hand, v5.16 is slightly better:
>>
>> /mnt/btrfs/file:
>>   EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
>>     0: [0..7]:          26624..26631         8   0x0
>>     1: [8..39]:         26632..26663        32 0x801
>> /mnt/btrfs/file:
>>   EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
>>     0: [0..7]:          26664..26671         8   0x0
>>     1: [8..39]:         26632..26663        32 0x801
>>
>> The preallocated range is not defragged, but the sector before it still
>> gets defragged, which has no need for it.
>>
>> [CAUSE]
>> One of the function reused by the old and new behavior is
>> defrag_check_next_extent(), it will determine if we should defrag
>> current extent by checking the next one.
>>
>> It only checks if the next extent is a hole or inlined, but it doesn't
>> check if it's preallocated.
>>
>> On the other hand, out of the function, both old and new kernel will
>> reject preallocated extents.
>>
>> Such inconsistent behavior causes above behavior.
>>
>> [FIX]
>> - Also check if next extent is preallocated
>>    If so, don't defrag current extent.
>>
>> - Add comments for each branch why we reject the extent
>>
>> This will reduce the IO caused by defrag ioctl and autodefrag.
>>
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> ---
>> Changelog:
>> v2:
>> - Use @extent_thresh from caller to replace the harded coded threshold
>>    Now caller has full control over the extent threshold value.
>>
>> - Remove the old ambiguous check based on physical address
>>    The original check is too specific, only reject extents which are
>>    physically adjacent, AND too large.
>>    Since we have correct size check now, and the physically adjacent check
>>    is not always a win.
>>    So remove the old check completely.
>>
>> v3:
>> - Split the @extent_thresh and physicall adjacent check into other
>>    patches
>>
>> - Simplify the comment
>> ---
>>   fs/btrfs/ioctl.c | 20 +++++++++++++-------
>>   1 file changed, 13 insertions(+), 7 deletions(-)
>>
>> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
>> index 91ba2efe9792..0d8bfc716e6b 100644
>> --- a/fs/btrfs/ioctl.c
>> +++ b/fs/btrfs/ioctl.c
>> @@ -1053,19 +1053,25 @@ static bool defrag_check_next_extent(struct inode *inode, struct extent_map *em,
>>   				     bool locked)
>>   {
>>   	struct extent_map *next;
>> -	bool ret = true;
>> +	bool ret = false;
>>
>>   	/* this is the last extent */
>>   	if (em->start + em->len >= i_size_read(inode))
>> -		return false;
>> +		return ret;
>>
>>   	next = defrag_lookup_extent(inode, em->start + em->len, locked);
>> +	/* No more em or hole */
>>   	if (!next || next->block_start >= EXTENT_MAP_LAST_BYTE)
>> -		ret = false;
>> -	else if ((em->block_start + em->block_len == next->block_start) &&
>> -		 (em->block_len > SZ_128K && next->block_len > SZ_128K))
>> -		ret = false;
>> -
>> +		goto out;
>> +	/* Preallocated */
>> +	if (test_bit(EXTENT_FLAG_PREALLOC, &em->flags))
>
> The comment is superfluous, the name of the flag is pretty informative that
> we are checking for a preallocated extent. You don't need to send a new
> version just to remove the comment however.
>
> For the Fixes: tag, you can just comment and David will pick it.
>
> Reviewed-by: Filipe Manana <fdmanana@suse.com>
>
> Thanks.
>
>> +		goto out;
>> +	/* Physically adjacent and large enough */
>> +	if ((em->block_start + em->block_len == next->block_start) &&
>> +	    (em->block_len > SZ_128K && next->block_len > SZ_128K))
>> +		goto out;
>> +	ret = true;
>> +out:
>>   	free_extent_map(next);
>>   	return ret;
>>   }
>> --
>> 2.34.1
>>


* Re: [PATCH 2/3] btrfs: defrag: use extent_thresh to replace the hardcoded size limit
  2022-01-26  0:58 ` [PATCH 2/3] btrfs: defrag: use extent_thresh to replace the hardcoded size limit Qu Wenruo
@ 2022-01-26 11:40   ` Filipe Manana
  2022-01-26 12:26     ` Qu Wenruo
  0 siblings, 1 reply; 16+ messages in thread
From: Filipe Manana @ 2022-01-26 11:40 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

On Wed, Jan 26, 2022 at 08:58:49AM +0800, Qu Wenruo wrote:
> In defrag_lookup_extent() we use hardcoded extent size threshold, SZ_128K,
> other than @extent_thresh in btrfs_defrag_file().
> 
> This can lead to some inconsistent behavior, especially the default
> extent size threshold is 256K.
> 
> Fix this by passing @extent_thresh into defrag_check_next_extent() and
> use that value.
> 
> Also, since the extent_thresh check should be applied to all extents,
> not only physically adjacent extents, move the threshold check into a
> dedicate if ().
> 
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
>  fs/btrfs/ioctl.c | 12 +++++++-----
>  1 file changed, 7 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> index 0d8bfc716e6b..2911df12fc48 100644
> --- a/fs/btrfs/ioctl.c
> +++ b/fs/btrfs/ioctl.c
> @@ -1050,7 +1050,7 @@ static struct extent_map *defrag_lookup_extent(struct inode *inode, u64 start,
>  }
>  
>  static bool defrag_check_next_extent(struct inode *inode, struct extent_map *em,
> -				     bool locked)
> +				     u32 extent_thresh, bool locked)
>  {
>  	struct extent_map *next;
>  	bool ret = false;
> @@ -1066,9 +1066,11 @@ static bool defrag_check_next_extent(struct inode *inode, struct extent_map *em,
>  	/* Preallocated */
>  	if (test_bit(EXTENT_FLAG_PREALLOC, &em->flags))
>  		goto out;
> -	/* Physically adjacent and large enough */
> -	if ((em->block_start + em->block_len == next->block_start) &&
> -	    (em->block_len > SZ_128K && next->block_len > SZ_128K))
> +	/* Extent is already large enough */
> +	if (next->len >= extent_thresh)
> +		goto out;

So this will trigger unnecessary rewrites of compressed extents.
The SZ_128K is there to deal with compressed extents; it has nothing to
do with the threshold passed to the ioctl.

After applying this patchset, if you run a trivial test like this:

   #!/bin/bash

   DEV=/dev/sdj
   MNT=/mnt/sdj

   mkfs.btrfs -f $DEV
   mount -o compress $DEV $MNT

   xfs_io -f -c "pwrite -S 0xab 0 128K" $MNT/foobar
   sync
   # Write to some other file so that the next extent for foobar
   # is not contiguous with the first extent.
   xfs_io -f -c "pwrite 0 128K" $MNT/baz
   sync
   xfs_io -f -c "pwrite -S 0xcd 128K 128K" $MNT/foobar
   sync

   echo -e "\n\nTree after creating file:\n\n"
   btrfs inspect-internal dump-tree -t 5 $DEV

   btrfs filesystem defragment $MNT/foobar
   sync

   echo -e "\n\nTree after defrag:\n\n"
   btrfs inspect-internal dump-tree -t 5 $DEV

   umount $MNT

It will result in rewriting the two 128K compressed extents:

(...)
Tree after write and sync:

btrfs-progs v5.12.1 
fs tree key (FS_TREE ROOT_ITEM 0) 
(...)
	item 7 key (257 INODE_REF 256) itemoff 15797 itemsize 16
		index 2 namelen 6 name: foobar
	item 8 key (257 EXTENT_DATA 0) itemoff 15744 itemsize 53
		generation 6 type 1 (regular)
		extent data disk byte 13631488 nr 4096
		extent data offset 0 nr 131072 ram 131072
		extent compression 1 (zlib)
	item 9 key (257 EXTENT_DATA 131072) itemoff 15691 itemsize 53
		generation 8 type 1 (regular)
		extent data disk byte 14163968 nr 4096
		extent data offset 0 nr 131072 ram 131072
		extent compression 1 (zlib)
(...)

Tree after defrag:

btrfs-progs v5.12.1 
fs tree key (FS_TREE ROOT_ITEM 0) 
(...)
	item 7 key (257 INODE_REF 256) itemoff 15797 itemsize 16
		index 2 namelen 6 name: foobar
	item 8 key (257 EXTENT_DATA 0) itemoff 15744 itemsize 53
		generation 9 type 1 (regular)
		extent data disk byte 14430208 nr 4096
		extent data offset 0 nr 131072 ram 131072
		extent compression 1 (zlib)
	item 9 key (257 EXTENT_DATA 131072) itemoff 15691 itemsize 53
		generation 9 type 1 (regular)
		extent data disk byte 13635584 nr 4096
		extent data offset 0 nr 131072 ram 131072
		extent compression 1 (zlib)

In other words, a waste of IO and CPU time.

So it needs to check whether we are dealing with compressed extents,
and if so, skip when either of them has a size of SZ_128K (with the
changelog updated accordingly).

Thanks.

> +	/* Physically adjacent */
> +	if ((em->block_start + em->block_len == next->block_start))
>  		goto out;
>  	ret = true;
>  out:
> @@ -1231,7 +1233,7 @@ static int defrag_collect_targets(struct btrfs_inode *inode,
>  			goto next;
>  
>  		next_mergeable = defrag_check_next_extent(&inode->vfs_inode, em,
> -							  locked);
> +							  extent_thresh, locked);
>  		if (!next_mergeable) {
>  			struct defrag_target_range *last;
>  
> -- 
> 2.34.1
> 


* Re: [PATCH 3/3] btrfs: defrag: remove the physical adjacent extents rejection in defrag_check_next_extent()
  2022-01-26  0:58 ` [PATCH 3/3] btrfs: defrag: remove the physical adjacent extents rejection in defrag_check_next_extent() Qu Wenruo
@ 2022-01-26 11:44   ` Filipe Manana
  0 siblings, 0 replies; 16+ messages in thread
From: Filipe Manana @ 2022-01-26 11:44 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

On Wed, Jan 26, 2022 at 08:58:50AM +0800, Qu Wenruo wrote:
> There is a rejection for physically adjacent extents in
> defrag_check_next_extent() from the very beginning.
> 
> The check will reject physically adjacent extents which are also large
> enough.
> 
> The extent size threshold check is now a generic check, and the
> benefit of rejecting physically adjacent extents is unsure.
> 
> Sure physically adjacent extents means no extra seek time, thus
> defragging them may not bring much help.
> 
> On the other hand, btrfs also benefits from reduced number of extents
> (which can reduce the size of extent tree, thus reduce the mount time).

And also reduce the number of file extent items in the subvolume's tree.

I think it's fine.

Reviewed-by: Filipe Manana <fdmanana@suse.com>

Thanks.

> 
> So such rejection is not a full win.
> 
> Remove such check, and policy on which extents should be defragged is
> mostly on @extent_thresh and @newer_than parameters.
> 
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
>  fs/btrfs/ioctl.c | 3 ---
>  1 file changed, 3 deletions(-)
> 
> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> index 2911df12fc48..0929d08bb378 100644
> --- a/fs/btrfs/ioctl.c
> +++ b/fs/btrfs/ioctl.c
> @@ -1069,9 +1069,6 @@ static bool defrag_check_next_extent(struct inode *inode, struct extent_map *em,
>  	/* Extent is already large enough */
>  	if (next->len >= extent_thresh)
>  		goto out;
> -	/* Physically adjacent */
> -	if ((em->block_start + em->block_len == next->block_start))
> -		goto out;
>  	ret = true;
>  out:
>  	free_extent_map(next);
> -- 
> 2.34.1
> 


* Re: [PATCH v3 1/3] btrfs: defrag: don't try to merge regular extents with preallocated extents
  2022-01-26 11:33   ` Qu Wenruo
@ 2022-01-26 11:47     ` Filipe Manana
  0 siblings, 0 replies; 16+ messages in thread
From: Filipe Manana @ 2022-01-26 11:47 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: Qu Wenruo, linux-btrfs

On Wed, Jan 26, 2022 at 11:33 AM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>
>
>
> On 2022/1/26 19:26, Filipe Manana wrote:
> > On Wed, Jan 26, 2022 at 08:58:48AM +0800, Qu Wenruo wrote:
> >> [BUG]
> >> With older kernels (before v5.16), btrfs will defrag preallocated extents.
> >> While with newer kernels (v5.16 and newer) btrfs will not defrag
> >> preallocated extents, but it will defrag the extent just before the
> >> preallocated extent, even it's just a single sector.
> >
> > In that case, isn't a Fixes: tag missing?
>
> The function defrag_check_next_extent() is reused from older kernel
> code, dating back to 2012.
> (And even that 2012 commit is not really the root cause).
> Thus it's missing preallocated check from the very beginning unfortunately.
>
> Does it still make sense to include such an old commit?
>
> It would make more sense to CC this to v5.x stable branch, though.
>
> Any advice on this situation?

Hum, ok, then I think it's fine without the tag.
Thanks.

>
> Thanks,
> Qu
> >
> >>
> >> This can be exposed by the following small script:
> >>
> >>      mkfs.btrfs -f $dev > /dev/null
> >>
> >>      mount $dev $mnt
> >>      xfs_io -f -c "pwrite 0 4k" -c sync -c "falloc 4k 16K" $mnt/file
> >>      xfs_io -c "fiemap -v" $mnt/file
> >>      btrfs fi defrag $mnt/file
> >>      sync
> >>      xfs_io -c "fiemap -v" $mnt/file
> >>
> >> The output looks like this on older kernels:
> >>
> >> /mnt/btrfs/file:
> >>   EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
> >>     0: [0..7]:          26624..26631         8   0x0
> >>     1: [8..39]:         26632..26663        32 0x801
> >> /mnt/btrfs/file:
> >>   EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
> >>     0: [0..39]:         26664..26703        40   0x1
> >>
> >> Which defrags the single sector along with the preallocated extent, and
> >> replace them with an regular extent into a new location (caused by data
> >> COW).
> >> This wastes most of the data IO just for the preallocated range.
> >>
> >> On the other hand, v5.16 is slightly better:
> >>
> >> /mnt/btrfs/file:
> >>   EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
> >>     0: [0..7]:          26624..26631         8   0x0
> >>     1: [8..39]:         26632..26663        32 0x801
> >> /mnt/btrfs/file:
> >>   EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
> >>     0: [0..7]:          26664..26671         8   0x0
> >>     1: [8..39]:         26632..26663        32 0x801
> >>
> >> The preallocated range is not defragged, but the sector before it still
> >> gets defragged, which has no need for it.
> >>
> >> [CAUSE]
> >> One of the function reused by the old and new behavior is
> >> defrag_check_next_extent(), it will determine if we should defrag
> >> current extent by checking the next one.
> >>
> >> It only checks if the next extent is a hole or inlined, but it doesn't
> >> check if it's preallocated.
> >>
> >> On the other hand, out of the function, both old and new kernel will
> >> reject preallocated extents.
> >>
> >> Such inconsistent behavior causes above behavior.
> >>
> >> [FIX]
> >> - Also check if next extent is preallocated
> >>    If so, don't defrag current extent.
> >>
> >> - Add comments for each branch why we reject the extent
> >>
> >> This will reduce the IO caused by defrag ioctl and autodefrag.
> >>
> >> Signed-off-by: Qu Wenruo <wqu@suse.com>
> >> ---
> >> Changelog:
> >> v2:
> >> - Use @extent_thresh from caller to replace the harded coded threshold
> >>    Now caller has full control over the extent threshold value.
> >>
> >> - Remove the old ambiguous check based on physical address
> >>    The original check is too specific, only reject extents which are
> >>    physically adjacent, AND too large.
> >>    Since we have correct size check now, and the physically adjacent check
> >>    is not always a win.
> >>    So remove the old check completely.
> >>
> >> v3:
> >> - Split the @extent_thresh and physicall adjacent check into other
> >>    patches
> >>
> >> - Simplify the comment
> >> ---
> >>   fs/btrfs/ioctl.c | 20 +++++++++++++-------
> >>   1 file changed, 13 insertions(+), 7 deletions(-)
> >>
> >> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> >> index 91ba2efe9792..0d8bfc716e6b 100644
> >> --- a/fs/btrfs/ioctl.c
> >> +++ b/fs/btrfs/ioctl.c
> >> @@ -1053,19 +1053,25 @@ static bool defrag_check_next_extent(struct inode *inode, struct extent_map *em,
> >>                                   bool locked)
> >>   {
> >>      struct extent_map *next;
> >> -    bool ret = true;
> >> +    bool ret = false;
> >>
> >>      /* this is the last extent */
> >>      if (em->start + em->len >= i_size_read(inode))
> >> -            return false;
> >> +            return ret;
> >>
> >>      next = defrag_lookup_extent(inode, em->start + em->len, locked);
> >> +    /* No more em or hole */
> >>      if (!next || next->block_start >= EXTENT_MAP_LAST_BYTE)
> >> -            ret = false;
> >> -    else if ((em->block_start + em->block_len == next->block_start) &&
> >> -             (em->block_len > SZ_128K && next->block_len > SZ_128K))
> >> -            ret = false;
> >> -
> >> +            goto out;
> >> +    /* Preallocated */
> >> +    if (test_bit(EXTENT_FLAG_PREALLOC, &em->flags))
> >
> > The comment is superfluous, the name of the flag is pretty informative that
> > we are checking for a preallocated extent. You don't need to send a new
> > version just to remove the comment however.
> >
> > For the Fixes: tag, you can just comment and David will pick it.
> >
> > Reviewed-by: Filipe Manana <fdmanana@suse.com>
> >
> > Thanks.
> >
> >> +            goto out;
> >> +    /* Physically adjacent and large enough */
> >> +    if ((em->block_start + em->block_len == next->block_start) &&
> >> +        (em->block_len > SZ_128K && next->block_len > SZ_128K))
> >> +            goto out;
> >> +    ret = true;
> >> +out:
> >>      free_extent_map(next);
> >>      return ret;
> >>   }
> >> --
> >> 2.34.1
> >>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/3] btrfs: defrag: use extent_thresh to replace the hardcoded size limit
  2022-01-26 11:40   ` Filipe Manana
@ 2022-01-26 12:26     ` Qu Wenruo
  2022-01-26 12:36       ` Filipe Manana
  0 siblings, 1 reply; 16+ messages in thread
From: Qu Wenruo @ 2022-01-26 12:26 UTC (permalink / raw)
  To: Filipe Manana; +Cc: linux-btrfs



On 2022/1/26 19:40, Filipe Manana wrote:
> On Wed, Jan 26, 2022 at 08:58:49AM +0800, Qu Wenruo wrote:
>> In defrag_lookup_extent() we use a hardcoded extent size threshold, SZ_128K,
>> rather than @extent_thresh in btrfs_defrag_file().
>>
>> This can lead to some inconsistent behavior, especially since the default
>> extent size threshold is 256K.
>>
>> Fix this by passing @extent_thresh into defrag_check_next_extent() and
>> using that value.
>>
>> Also, since the extent_thresh check should be applied to all extents,
>> not only physically adjacent extents, move the threshold check into a
>> dedicated if ().
>>
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> ---
>>   fs/btrfs/ioctl.c | 12 +++++++-----
>>   1 file changed, 7 insertions(+), 5 deletions(-)
>>
>> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
>> index 0d8bfc716e6b..2911df12fc48 100644
>> --- a/fs/btrfs/ioctl.c
>> +++ b/fs/btrfs/ioctl.c
>> @@ -1050,7 +1050,7 @@ static struct extent_map *defrag_lookup_extent(struct inode *inode, u64 start,
>>   }
>>   
>>   static bool defrag_check_next_extent(struct inode *inode, struct extent_map *em,
>> -				     bool locked)
>> +				     u32 extent_thresh, bool locked)
>>   {
>>   	struct extent_map *next;
>>   	bool ret = false;
>> @@ -1066,9 +1066,11 @@ static bool defrag_check_next_extent(struct inode *inode, struct extent_map *em,
>>   	/* Preallocated */
>>   	if (test_bit(EXTENT_FLAG_PREALLOC, &em->flags))
>>   		goto out;
>> -	/* Physically adjacent and large enough */
>> -	if ((em->block_start + em->block_len == next->block_start) &&
>> -	    (em->block_len > SZ_128K && next->block_len > SZ_128K))
>> +	/* Extent is already large enough */
>> +	if (next->len >= extent_thresh)
>> +		goto out;
> 
> So this will trigger unnecessary rewrites of compressed extents.
> The SZ_128K is there to deal with compressed extents, it has nothing to
> do with the threshold passed to the ioctl.

Then there is still something wrong.

The original check will only reject it when both conditions are met.

So based on your script, I can still find a way to defrag the extents, 
with or without this modification:

	mkfs.btrfs -f $DEV
	mount -o compress $DEV $MNT
	
	xfs_io -f -c "pwrite -S 0xab 0 128K" $MNT/file1
	sync
	xfs_io -f -c "pwrite -S 0xab 0 128K" $MNT/file2
	sync
	xfs_io -f -c "pwrite -S 0xab 128K 128K" $MNT/file1
	sync

	echo "=== file1 before defrag ==="
	xfs_io -f -c "fiemap -v" $MNT/file1
	echo "=== file1 after defrag ==="
	btrfs fi defrag $MNT/file1
	sync
	xfs_io -f -c "fiemap -v" $MNT/file1

The output looks like this:

=== before ===
/mnt/btrfs/file1:
  EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
    0: [0..255]:        26624..26879       256   0x8
    1: [256..511]:      26640..26895       256   0x9
=== after ===
/mnt/btrfs/file1:
  EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
    0: [0..255]:        26648..26903       256   0x8
    1: [256..511]:      26656..26911       256   0x9

Whether or not the patch is applied, the result is the same.

So thank you very much for finding another case we're not handling well...


BTW, if the check is meant to reject adjacent non-compressed extents, the 
original one is still incorrect; we can have extents smaller than 128K 
that are still uncompressed.

So what we really want is to reject physically adjacent, non-compressed 
extents?

Thanks,
Qu
> 
> After applying this patchset, if you run a trivial test like this:
> 
>     #!/bin/bash
> 
>     DEV=/dev/sdj
>     MNT=/mnt/sdj
> 
>     mkfs.btrfs -f $DEV
>     mount -o compress $DEV $MNT
> 
>     xfs_io -f -c "pwrite -S 0xab 0 128K" $MNT/foobar
>     sync
>     # Write to some other file so that the next extent for foobar
>     # is not contiguous with the first extent.
>     xfs_io -f -c "pwrite 0 128K" $MNT/baz
>     sync
>     xfs_io -f -c "pwrite -S 0xcd 128K 128K" $MNT/foobar
>     sync
> 
>     echo -e "\n\nTree after creating file:\n\n"
>     btrfs inspect-internal dump-tree -t 5 $DEV
> 
>     btrfs filesystem defragment $MNT/foobar
>     sync
> 
>     echo -e "\n\nTree after defrag:\n\n"
>     btrfs inspect-internal dump-tree -t 5 $DEV
> 
>     umount $MNT
> 
> It will result in rewriting the two 128K compressed extents:
> 
> (...)
> Tree after write and sync:
> 
> btrfs-progs v5.12.1
> fs tree key (FS_TREE ROOT_ITEM 0)
> (...)
> 	item 7 key (257 INODE_REF 256) itemoff 15797 itemsize 16
> 		index 2 namelen 6 name: foobar
> 	item 8 key (257 EXTENT_DATA 0) itemoff 15744 itemsize 53
> 		generation 6 type 1 (regular)
> 		extent data disk byte 13631488 nr 4096
> 		extent data offset 0 nr 131072 ram 131072
> 		extent compression 1 (zlib)
> 	item 9 key (257 EXTENT_DATA 131072) itemoff 15691 itemsize 53
> 		generation 8 type 1 (regular)
> 		extent data disk byte 14163968 nr 4096
> 		extent data offset 0 nr 131072 ram 131072
> 		extent compression 1 (zlib)
> (...)
> 
> Tree after defrag:
> 
> btrfs-progs v5.12.1
> fs tree key (FS_TREE ROOT_ITEM 0)
> (...)
> 	item 7 key (257 INODE_REF 256) itemoff 15797 itemsize 16
> 		index 2 namelen 6 name: foobar
> 	item 8 key (257 EXTENT_DATA 0) itemoff 15744 itemsize 53
> 		generation 9 type 1 (regular)
> 		extent data disk byte 14430208 nr 4096
> 		extent data offset 0 nr 131072 ram 131072
> 		extent compression 1 (zlib)
> 	item 9 key (257 EXTENT_DATA 131072) itemoff 15691 itemsize 53
> 		generation 9 type 1 (regular)
> 		extent data disk byte 13635584 nr 4096
> 		extent data offset 0 nr 131072 ram 131072
> 		extent compression 1 (zlib)
> 
> In other words, a waste of IO and CPU time.
> 
> So it needs to check if we are dealing with compressed extents, and
> if so, skip if either of them has a size of SZ_128K (and changelog updated).
> 
> Thanks.
> 
>> +	/* Physically adjacent */
>> +	if ((em->block_start + em->block_len == next->block_start))
>>   		goto out;
>>   	ret = true;
>>   out:
>> @@ -1231,7 +1233,7 @@ static int defrag_collect_targets(struct btrfs_inode *inode,
>>   			goto next;
>>   
>>   		next_mergeable = defrag_check_next_extent(&inode->vfs_inode, em,
>> -							  locked);
>> +							  extent_thresh, locked);
>>   		if (!next_mergeable) {
>>   			struct defrag_target_range *last;
>>   
>> -- 
>> 2.34.1
>>
> 



* Re: [PATCH 2/3] btrfs: defrag: use extent_thresh to replace the hardcoded size limit
  2022-01-26 12:26     ` Qu Wenruo
@ 2022-01-26 12:36       ` Filipe Manana
  2022-01-26 13:00         ` Qu Wenruo
  0 siblings, 1 reply; 16+ messages in thread
From: Filipe Manana @ 2022-01-26 12:36 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

On Wed, Jan 26, 2022 at 12:26 PM Qu Wenruo <wqu@suse.com> wrote:
>
>
>
> On 2022/1/26 19:40, Filipe Manana wrote:
> > On Wed, Jan 26, 2022 at 08:58:49AM +0800, Qu Wenruo wrote:
> >> In defrag_lookup_extent() we use a hardcoded extent size threshold, SZ_128K,
> >> rather than @extent_thresh in btrfs_defrag_file().
> >>
> >> This can lead to some inconsistent behavior, especially since the default
> >> extent size threshold is 256K.
> >>
> >> Fix this by passing @extent_thresh into defrag_check_next_extent() and
> >> using that value.
> >>
> >> Also, since the extent_thresh check should be applied to all extents,
> >> not only physically adjacent extents, move the threshold check into a
> >> dedicated if ().
> >>
> >> Signed-off-by: Qu Wenruo <wqu@suse.com>
> >> ---
> >>   fs/btrfs/ioctl.c | 12 +++++++-----
> >>   1 file changed, 7 insertions(+), 5 deletions(-)
> >>
> >> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> >> index 0d8bfc716e6b..2911df12fc48 100644
> >> --- a/fs/btrfs/ioctl.c
> >> +++ b/fs/btrfs/ioctl.c
> >> @@ -1050,7 +1050,7 @@ static struct extent_map *defrag_lookup_extent(struct inode *inode, u64 start,
> >>   }
> >>
> >>   static bool defrag_check_next_extent(struct inode *inode, struct extent_map *em,
> >> -                                 bool locked)
> >> +                                 u32 extent_thresh, bool locked)
> >>   {
> >>      struct extent_map *next;
> >>      bool ret = false;
> >> @@ -1066,9 +1066,11 @@ static bool defrag_check_next_extent(struct inode *inode, struct extent_map *em,
> >>      /* Preallocated */
> >>      if (test_bit(EXTENT_FLAG_PREALLOC, &em->flags))
> >>              goto out;
> >> -    /* Physically adjacent and large enough */
> >> -    if ((em->block_start + em->block_len == next->block_start) &&
> >> -        (em->block_len > SZ_128K && next->block_len > SZ_128K))
> >> +    /* Extent is already large enough */
> >> +    if (next->len >= extent_thresh)
> >> +            goto out;
> >
> > So this will trigger unnecessary rewrites of compressed extents.
> > The SZ_128K is there to deal with compressed extents, it has nothing to
> > do with the threshold passed to the ioctl.
>
> Then there is still something wrong.
>
> The original check will only reject it when both conditions are met.
>
> So based on your script, I can still find a way to defrag the extents,
> with or without this modification:

Right, without the intermediary write to file "baz", this patchset
brings a regression in regards to
compressed extents - when they are adjacent, which is typically the
case when doing large writes,
as they'll create multiple extents covering consecutive 128K ranges.

With the write to file "baz", as I pasted it, it happens before and
after the patchset.

>
>         mkfs.btrfs -f $DEV
>         mount -o compress $DEV $MNT
>
>         xfs_io -f -c "pwrite -S 0xab 0 128K" $MNT/file1
>         sync
>         xfs_io -f -c "pwrite -S 0xab 0 128K" $MNT/file2
>         sync
>         xfs_io -f -c "pwrite -S 0xab 128K 128K" $MNT/file1
>         sync
>
>         echo "=== file1 before defrag ==="
>         xfs_io -f -c "fiemap -v" $MNT/file1
>         echo "=== file1 after defrag ==="
>         btrfs fi defrag $MNT/file1
>         sync
>         xfs_io -f -c "fiemap -v" $MNT/file1
>
> The output looks like this:
>
> === before ===
> /mnt/btrfs/file1:
>   EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
>     0: [0..255]:        26624..26879       256   0x8
>     1: [256..511]:      26640..26895       256   0x9
> === after ===
> /mnt/btrfs/file1:
>   EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
>     0: [0..255]:        26648..26903       256   0x8
>     1: [256..511]:      26656..26911       256   0x9
>
> No matter if the patch is applied, the result is the same.

Yes, explained above.

>
> So thank you very much for finding another case we're not handling well...
>
>
> BTW, if the check is meant to reject adjacent non-compressed extents, the
> original one is still incorrect; we can have extents smaller than 128K
> that are still uncompressed.
>
> So what we really want is to reject physically adjacent, non-compressed
> extents?

We want to avoid doing work that does nothing.
If 2 consecutive extents are compressed and at least one is already
128K, then it's a waste of time, IO and CPU.

And that's a fairly common scenario. Do a one megabyte write for
example, then after writeback we end up with several 128K extents with
compression.
In that case defrag should do nothing for the whole range.


>
> Thanks,
> Qu
> >
> > After applying this patchset, if you run a trivial test like this:
> >
> >     #!/bin/bash
> >
> >     DEV=/dev/sdj
> >     MNT=/mnt/sdj
> >
> >     mkfs.btrfs -f $DEV
> >     mount -o compress $DEV $MNT
> >
> >     xfs_io -f -c "pwrite -S 0xab 0 128K" $MNT/foobar
> >     sync
> >     # Write to some other file so that the next extent for foobar
> >     # is not contiguous with the first extent.
> >     xfs_io -f -c "pwrite 0 128K" $MNT/baz
> >     sync
> >     xfs_io -f -c "pwrite -S 0xcd 128K 128K" $MNT/foobar
> >     sync
> >
> >     echo -e "\n\nTree after creating file:\n\n"
> >     btrfs inspect-internal dump-tree -t 5 $DEV
> >
> >     btrfs filesystem defragment $MNT/foobar
> >     sync
> >
> >     echo -e "\n\nTree after defrag:\n\n"
> >     btrfs inspect-internal dump-tree -t 5 $DEV
> >
> >     umount $MNT
> >
> > It will result in rewriting the two 128K compressed extents:
> >
> > (...)
> > Tree after write and sync:
> >
> > btrfs-progs v5.12.1
> > fs tree key (FS_TREE ROOT_ITEM 0)
> > (...)
> >       item 7 key (257 INODE_REF 256) itemoff 15797 itemsize 16
> >               index 2 namelen 6 name: foobar
> >       item 8 key (257 EXTENT_DATA 0) itemoff 15744 itemsize 53
> >               generation 6 type 1 (regular)
> >               extent data disk byte 13631488 nr 4096
> >               extent data offset 0 nr 131072 ram 131072
> >               extent compression 1 (zlib)
> >       item 9 key (257 EXTENT_DATA 131072) itemoff 15691 itemsize 53
> >               generation 8 type 1 (regular)
> >               extent data disk byte 14163968 nr 4096
> >               extent data offset 0 nr 131072 ram 131072
> >               extent compression 1 (zlib)
> > (...)
> >
> > Tree after defrag:
> >
> > btrfs-progs v5.12.1
> > fs tree key (FS_TREE ROOT_ITEM 0)
> > (...)
> >       item 7 key (257 INODE_REF 256) itemoff 15797 itemsize 16
> >               index 2 namelen 6 name: foobar
> >       item 8 key (257 EXTENT_DATA 0) itemoff 15744 itemsize 53
> >               generation 9 type 1 (regular)
> >               extent data disk byte 14430208 nr 4096
> >               extent data offset 0 nr 131072 ram 131072
> >               extent compression 1 (zlib)
> >       item 9 key (257 EXTENT_DATA 131072) itemoff 15691 itemsize 53
> >               generation 9 type 1 (regular)
> >               extent data disk byte 13635584 nr 4096
> >               extent data offset 0 nr 131072 ram 131072
> >               extent compression 1 (zlib)
> >
> > In other words, a waste of IO and CPU time.
> >
> > So it needs to check if we are dealing with compressed extents, and
> > if so, skip if either of them has a size of SZ_128K (and changelog updated).
> >
> > Thanks.
> >
> >> +    /* Physically adjacent */
> >> +    if ((em->block_start + em->block_len == next->block_start))
> >>              goto out;
> >>      ret = true;
> >>   out:
> >> @@ -1231,7 +1233,7 @@ static int defrag_collect_targets(struct btrfs_inode *inode,
> >>                      goto next;
> >>
> >>              next_mergeable = defrag_check_next_extent(&inode->vfs_inode, em,
> >> -                                                      locked);
> >> +                                                      extent_thresh, locked);
> >>              if (!next_mergeable) {
> >>                      struct defrag_target_range *last;
> >>
> >> --
> >> 2.34.1
> >>
> >
>


* Re: [PATCH 2/3] btrfs: defrag: use extent_thresh to replace the hardcoded size limit
  2022-01-26 12:36       ` Filipe Manana
@ 2022-01-26 13:00         ` Qu Wenruo
  2022-01-26 13:37           ` Filipe Manana
  0 siblings, 1 reply; 16+ messages in thread
From: Qu Wenruo @ 2022-01-26 13:00 UTC (permalink / raw)
  To: Filipe Manana, Qu Wenruo; +Cc: linux-btrfs



On 2022/1/26 20:36, Filipe Manana wrote:
> On Wed, Jan 26, 2022 at 12:26 PM Qu Wenruo <wqu@suse.com> wrote:
>>
>>
>>
>> On 2022/1/26 19:40, Filipe Manana wrote:
>>> On Wed, Jan 26, 2022 at 08:58:49AM +0800, Qu Wenruo wrote:
>>>> In defrag_lookup_extent() we use a hardcoded extent size threshold, SZ_128K,
>>>> rather than @extent_thresh in btrfs_defrag_file().
>>>>
>>>> This can lead to some inconsistent behavior, especially since the default
>>>> extent size threshold is 256K.
>>>>
>>>> Fix this by passing @extent_thresh into defrag_check_next_extent() and
>>>> using that value.
>>>>
>>>> Also, since the extent_thresh check should be applied to all extents,
>>>> not only physically adjacent extents, move the threshold check into a
>>>> dedicated if ().
>>>>
>>>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>>>> ---
>>>>    fs/btrfs/ioctl.c | 12 +++++++-----
>>>>    1 file changed, 7 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
>>>> index 0d8bfc716e6b..2911df12fc48 100644
>>>> --- a/fs/btrfs/ioctl.c
>>>> +++ b/fs/btrfs/ioctl.c
>>>> @@ -1050,7 +1050,7 @@ static struct extent_map *defrag_lookup_extent(struct inode *inode, u64 start,
>>>>    }
>>>>
>>>>    static bool defrag_check_next_extent(struct inode *inode, struct extent_map *em,
>>>> -                                 bool locked)
>>>> +                                 u32 extent_thresh, bool locked)
>>>>    {
>>>>       struct extent_map *next;
>>>>       bool ret = false;
>>>> @@ -1066,9 +1066,11 @@ static bool defrag_check_next_extent(struct inode *inode, struct extent_map *em,
>>>>       /* Preallocated */
>>>>       if (test_bit(EXTENT_FLAG_PREALLOC, &em->flags))
>>>>               goto out;
>>>> -    /* Physically adjacent and large enough */
>>>> -    if ((em->block_start + em->block_len == next->block_start) &&
>>>> -        (em->block_len > SZ_128K && next->block_len > SZ_128K))
>>>> +    /* Extent is already large enough */
>>>> +    if (next->len >= extent_thresh)
>>>> +            goto out;
>>>
>>> So this will trigger unnecessary rewrites of compressed extents.
>>> The SZ_128K is there to deal with compressed extents, it has nothing to
>>> do with the threshold passed to the ioctl.
>>
>> Then there is still something wrong.
>>
>> The original check will only reject it when both conditions are met.
>>
>> So based on your script, I can still find a way to defrag the extents,
>> with or without this modification:
>
> Right, without the intermediary write to file "baz", this patchset
> brings a regression in regards to
> compressed extents - when they are adjacent, which is typically the
> case when doing large writes,
> as they'll create multiple extents covering consecutive 128K ranges.
>
> With the write to file "baz", as I pasted it, it happens before and
> after the patchset.
>
>>
>>          mkfs.btrfs -f $DEV
>>          mount -o compress $DEV $MNT
>>
>>          xfs_io -f -c "pwrite -S 0xab 0 128K" $MNT/file1
>>          sync
>>          xfs_io -f -c "pwrite -S 0xab 0 128K" $MNT/file2
>>          sync
>>          xfs_io -f -c "pwrite -S 0xab 128K 128K" $MNT/file1
>>          sync
>>
>>          echo "=== file1 before defrag ==="
>>          xfs_io -f -c "fiemap -v" $MNT/file1
>>          echo "=== file1 after defrag ==="
>>          btrfs fi defrag $MNT/file1
>>          sync
>>          xfs_io -f -c "fiemap -v" $MNT/file1
>>
>> The output looks like this:
>>
>> === before ===
>> /mnt/btrfs/file1:
>>    EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
>>      0: [0..255]:        26624..26879       256   0x8
>>      1: [256..511]:      26640..26895       256   0x9
>> === after ===
>> /mnt/btrfs/file1:
>>    EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
>>      0: [0..255]:        26648..26903       256   0x8
>>      1: [256..511]:      26656..26911       256   0x9
>>
>> No matter if the patch is applied, the result is the same.
>
> Yes, explained above.
>
>>
>> So thank you very much for finding another case we're not handling well...
>>
>>
>> BTW, if the check is meant to reject adjacent non-compressed extents, the
>> original one is still incorrect; we can have extents smaller than 128K
>> that are still uncompressed.
>>
>> So what we really want is to reject physically adjacent, non-compressed
>> extents?
>
> We want to avoid doing work that does nothing.
> If 2 consecutive extents are compressed and at least one is already
> 128K, then it's a waste of time, IO and CPU.

So can we define the behavior like this?

  If the extent is already at its max capacity (compressed 128K,
   non-compressed 128M), we don't defrag it.

This also means we need to do the same check in
defrag_collect_targets() to avoid defragging such extents.

Thanks,
Qu


>
> And that's a fairly common scenario. Do a one megabyte write for
> example, then after writeback we end up with several 128K extents with
> compression.
> In that case defrag should do nothing for the whole range.
>
>
>>
>> Thanks,
>> Qu
>>>
>>> After applying this patchset, if you run a trivial test like this:
>>>
>>>      #!/bin/bash
>>>
>>>      DEV=/dev/sdj
>>>      MNT=/mnt/sdj
>>>
>>>      mkfs.btrfs -f $DEV
>>>      mount -o compress $DEV $MNT
>>>
>>>      xfs_io -f -c "pwrite -S 0xab 0 128K" $MNT/foobar
>>>      sync
>>>      # Write to some other file so that the next extent for foobar
>>>      # is not contiguous with the first extent.
>>>      xfs_io -f -c "pwrite 0 128K" $MNT/baz
>>>      sync
>>>      xfs_io -f -c "pwrite -S 0xcd 128K 128K" $MNT/foobar
>>>      sync
>>>
>>>      echo -e "\n\nTree after creating file:\n\n"
>>>      btrfs inspect-internal dump-tree -t 5 $DEV
>>>
>>>      btrfs filesystem defragment $MNT/foobar
>>>      sync
>>>
>>>      echo -e "\n\nTree after defrag:\n\n"
>>>      btrfs inspect-internal dump-tree -t 5 $DEV
>>>
>>>      umount $MNT
>>>
>>> It will result in rewriting the two 128K compressed extents:
>>>
>>> (...)
>>> Tree after write and sync:
>>>
>>> btrfs-progs v5.12.1
>>> fs tree key (FS_TREE ROOT_ITEM 0)
>>> (...)
>>>        item 7 key (257 INODE_REF 256) itemoff 15797 itemsize 16
>>>                index 2 namelen 6 name: foobar
>>>        item 8 key (257 EXTENT_DATA 0) itemoff 15744 itemsize 53
>>>                generation 6 type 1 (regular)
>>>                extent data disk byte 13631488 nr 4096
>>>                extent data offset 0 nr 131072 ram 131072
>>>                extent compression 1 (zlib)
>>>        item 9 key (257 EXTENT_DATA 131072) itemoff 15691 itemsize 53
>>>                generation 8 type 1 (regular)
>>>                extent data disk byte 14163968 nr 4096
>>>                extent data offset 0 nr 131072 ram 131072
>>>                extent compression 1 (zlib)
>>> (...)
>>>
>>> Tree after defrag:
>>>
>>> btrfs-progs v5.12.1
>>> fs tree key (FS_TREE ROOT_ITEM 0)
>>> (...)
>>>        item 7 key (257 INODE_REF 256) itemoff 15797 itemsize 16
>>>                index 2 namelen 6 name: foobar
>>>        item 8 key (257 EXTENT_DATA 0) itemoff 15744 itemsize 53
>>>                generation 9 type 1 (regular)
>>>                extent data disk byte 14430208 nr 4096
>>>                extent data offset 0 nr 131072 ram 131072
>>>                extent compression 1 (zlib)
>>>        item 9 key (257 EXTENT_DATA 131072) itemoff 15691 itemsize 53
>>>                generation 9 type 1 (regular)
>>>                extent data disk byte 13635584 nr 4096
>>>                extent data offset 0 nr 131072 ram 131072
>>>                extent compression 1 (zlib)
>>>
>>> In other words, a waste of IO and CPU time.
>>>
>>> So it needs to check if we are dealing with compressed extents, and
>>> if so, skip if either of them has a size of SZ_128K (and changelog updated).
>>>
>>> Thanks.
>>>
>>>> +    /* Physically adjacent */
>>>> +    if ((em->block_start + em->block_len == next->block_start))
>>>>               goto out;
>>>>       ret = true;
>>>>    out:
>>>> @@ -1231,7 +1233,7 @@ static int defrag_collect_targets(struct btrfs_inode *inode,
>>>>                       goto next;
>>>>
>>>>               next_mergeable = defrag_check_next_extent(&inode->vfs_inode, em,
>>>> -                                                      locked);
>>>> +                                                      extent_thresh, locked);
>>>>               if (!next_mergeable) {
>>>>                       struct defrag_target_range *last;
>>>>
>>>> --
>>>> 2.34.1
>>>>
>>>
>>


* Re: [PATCH 2/3] btrfs: defrag: use extent_thresh to replace the hardcoded size limit
  2022-01-26 13:00         ` Qu Wenruo
@ 2022-01-26 13:37           ` Filipe Manana
  2022-01-26 23:57             ` Qu Wenruo
  0 siblings, 1 reply; 16+ messages in thread
From: Filipe Manana @ 2022-01-26 13:37 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: Qu Wenruo, linux-btrfs

On Wed, Jan 26, 2022 at 1:00 PM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>
>
>
> On 2022/1/26 20:36, Filipe Manana wrote:
> > On Wed, Jan 26, 2022 at 12:26 PM Qu Wenruo <wqu@suse.com> wrote:
> >>
> >>
> >>
> >> On 2022/1/26 19:40, Filipe Manana wrote:
> >>> On Wed, Jan 26, 2022 at 08:58:49AM +0800, Qu Wenruo wrote:
> >>>> In defrag_lookup_extent() we use a hardcoded extent size threshold, SZ_128K,
> >>>> rather than @extent_thresh in btrfs_defrag_file().
> >>>>
> >>>> This can lead to some inconsistent behavior, especially since the default
> >>>> extent size threshold is 256K.
> >>>>
> >>>> Fix this by passing @extent_thresh into defrag_check_next_extent() and
> >>>> using that value.
> >>>>
> >>>> Also, since the extent_thresh check should be applied to all extents,
> >>>> not only physically adjacent extents, move the threshold check into a
> >>>> dedicated if ().
> >>>>
> >>>> Signed-off-by: Qu Wenruo <wqu@suse.com>
> >>>> ---
> >>>>    fs/btrfs/ioctl.c | 12 +++++++-----
> >>>>    1 file changed, 7 insertions(+), 5 deletions(-)
> >>>>
> >>>> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> >>>> index 0d8bfc716e6b..2911df12fc48 100644
> >>>> --- a/fs/btrfs/ioctl.c
> >>>> +++ b/fs/btrfs/ioctl.c
> >>>> @@ -1050,7 +1050,7 @@ static struct extent_map *defrag_lookup_extent(struct inode *inode, u64 start,
> >>>>    }
> >>>>
> >>>>    static bool defrag_check_next_extent(struct inode *inode, struct extent_map *em,
> >>>> -                                 bool locked)
> >>>> +                                 u32 extent_thresh, bool locked)
> >>>>    {
> >>>>       struct extent_map *next;
> >>>>       bool ret = false;
> >>>> @@ -1066,9 +1066,11 @@ static bool defrag_check_next_extent(struct inode *inode, struct extent_map *em,
> >>>>       /* Preallocated */
> >>>>       if (test_bit(EXTENT_FLAG_PREALLOC, &em->flags))
> >>>>               goto out;
> >>>> -    /* Physically adjacent and large enough */
> >>>> -    if ((em->block_start + em->block_len == next->block_start) &&
> >>>> -        (em->block_len > SZ_128K && next->block_len > SZ_128K))
> >>>> +    /* Extent is already large enough */
> >>>> +    if (next->len >= extent_thresh)
> >>>> +            goto out;
> >>>
> >>> So this will trigger unnecessary rewrites of compressed extents.
> >>> The SZ_128K is there to deal with compressed extents, it has nothing to
> >>> do with the threshold passed to the ioctl.
> >>
> >> Then there is still something wrong.
> >>
> >> The original check will only reject it when both conditions are met.
> >>
> >> So based on your script, I can still find a way to defrag the extents,
> >> with or without this modification:
> >
> > Right, without the intermediary write to file "baz", this patchset
> > brings a regression in regards to
> > compressed extents - when they are adjacent, which is typically the
> > case when doing large writes,
> > as they'll create multiple extents covering consecutive 128K ranges.
> >
> > With the write to file "baz", as I pasted it, it happens before and
> > after the patchset.
> >
> >>
> >>          mkfs.btrfs -f $DEV
> >>          mount -o compress $DEV $MNT
> >>
> >>          xfs_io -f -c "pwrite -S 0xab 0 128K" $MNT/file1
> >>          sync
> >>          xfs_io -f -c "pwrite -S 0xab 0 128K" $MNT/file2
> >>          sync
> >>          xfs_io -f -c "pwrite -S 0xab 128K 128K" $MNT/file1
> >>          sync
> >>
> >>          echo "=== file1 before defrag ==="
> >>          xfs_io -f -c "fiemap -v" $MNT/file1
> >>          echo "=== file1 after defrag ==="
> >>          btrfs fi defrag $MNT/file1
> >>          sync
> >>          xfs_io -f -c "fiemap -v" $MNT/file1
> >>
> >> The output looks like this:
> >>
> >> === before ===
> >> /mnt/btrfs/file1:
> >>    EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
> >>      0: [0..255]:        26624..26879       256   0x8
> >>      1: [256..511]:      26640..26895       256   0x9
> >> === after ===
> >> /mnt/btrfs/file1:
> >>    EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
> >>      0: [0..255]:        26648..26903       256   0x8
> >>      1: [256..511]:      26656..26911       256   0x9
> >>
> >> No matter if the patch is applied, the result is the same.
> >
> > Yes, explained above.
> >
> >>
> >> So thank you very much for finding another case we're not handling well...
> >>
> >>
> >> BTW, if the check is meant to reject adjacent non-compressed extents, the
> >> original one is still incorrect; we can have extents smaller than 128K
> >> that are still uncompressed.
> >>
> >> So what we really want is to reject physically adjacent, non-compressed
> >> extents?
> >
> > We want to avoid doing work that does nothing.
> > If 2 consecutive extents are compressed and at least one is already
> > 128K, then it's a waste of time, IO and CPU.
>
> So can we define the behavior like this?
>
>   If the extent is already at its max capacity (compressed 128K,
>    non-compressed 128M), we don't defrag it.

My previous suggestion was: if one of the extents is compressed and
its size is 128K, don't include it for defrag.

There are probably other cases to think about: one compressed extent
representing 100K of data, followed by another compressed extent
representing 64K of data, for example.
In that case using both for defrag will still result in 2 extents, one
for 128K of data and another for 36K of data - still not worth it to
defrag them; we end up with 2 extents, just different sizes.

At the very least we should not regress on what we did not defrag before:

2 extents with physically contiguous ranges, representing 128K of data
each, both compressed.

Which is a very common case.

Thanks.

>
> This also means we need to do the same check in
> defrag_collect_targets() to avoid defragging such extents.
>
> Thanks,
> Qu
>
>
> >
> > And that's a fairly common scenario. Do a one megabyte write for
> > example, then after writeback we end up with several 128K extents with
> > compression.
> > In that case defrag should do nothing for the whole range.
> >
> >
> >>
> >> Thanks,
> >> Qu
> >>>
> >>> After applying this patchset, if you run a trivial test like this:
> >>>
> >>>      #!/bin/bash
> >>>
> >>>      DEV=/dev/sdj
> >>>      MNT=/mnt/sdj
> >>>
> >>>      mkfs.btrfs -f $DEV
> >>>      mount -o compress $DEV $MNT
> >>>
> >>>      xfs_io -f -c "pwrite -S 0xab 0 128K" $MNT/foobar
> >>>      sync
> >>>      # Write to some other file so that the next extent for foobar
> >>>      # is not contiguous with the first extent.
> >>>      xfs_io -f -c "pwrite 0 128K" $MNT/baz
> >>>      sync
> >>>      xfs_io -f -c "pwrite -S 0xcd 128K 128K" $MNT/foobar
> >>>      sync
> >>>
> >>>      echo -e "\n\nTree after creating file:\n\n"
> >>>      btrfs inspect-internal dump-tree -t 5 $DEV
> >>>
> >>>      btrfs filesystem defragment $MNT/foobar
> >>>      sync
> >>>
> >>>      echo -e "\n\nTree after defrag:\n\n"
> >>>      btrfs inspect-internal dump-tree -t 5 $DEV
> >>>
> >>>      umount $MNT
> >>>
> >>> It will result in rewriting the two 128K compressed extents:
> >>>
> >>> (...)
> >>> Tree after write and sync:
> >>>
> >>> btrfs-progs v5.12.1
> >>> fs tree key (FS_TREE ROOT_ITEM 0)
> >>> (...)
> >>>        item 7 key (257 INODE_REF 256) itemoff 15797 itemsize 16
> >>>                index 2 namelen 6 name: foobar
> >>>        item 8 key (257 EXTENT_DATA 0) itemoff 15744 itemsize 53
> >>>                generation 6 type 1 (regular)
> >>>                extent data disk byte 13631488 nr 4096
> >>>                extent data offset 0 nr 131072 ram 131072
> >>>                extent compression 1 (zlib)
> >>>        item 9 key (257 EXTENT_DATA 131072) itemoff 15691 itemsize 53
> >>>                generation 8 type 1 (regular)
> >>>                extent data disk byte 14163968 nr 4096
> >>>                extent data offset 0 nr 131072 ram 131072
> >>>                extent compression 1 (zlib)
> >>> (...)
> >>>
> >>> Tree after defrag:
> >>>
> >>> btrfs-progs v5.12.1
> >>> fs tree key (FS_TREE ROOT_ITEM 0)
> >>> (...)
> >>>        item 7 key (257 INODE_REF 256) itemoff 15797 itemsize 16
> >>>                index 2 namelen 6 name: foobar
> >>>        item 8 key (257 EXTENT_DATA 0) itemoff 15744 itemsize 53
> >>>                generation 9 type 1 (regular)
> >>>                extent data disk byte 14430208 nr 4096
> >>>                extent data offset 0 nr 131072 ram 131072
> >>>                extent compression 1 (zlib)
> >>>        item 9 key (257 EXTENT_DATA 131072) itemoff 15691 itemsize 53
> >>>                generation 9 type 1 (regular)
> >>>                extent data disk byte 13635584 nr 4096
> >>>                extent data offset 0 nr 131072 ram 131072
> >>>                extent compression 1 (zlib)
> >>>
> >>> In other words, a waste of IO and CPU time.
> >>>
> >>> So it needs to check if we are dealing with compressed extents, and
> >>> if so, skip if either of them has a size of SZ_128K (and changelog updated).
> >>>
> >>> Thanks.
> >>>
> >>>> +    /* Physically adjacent */
> >>>> +    if ((em->block_start + em->block_len == next->block_start))
> >>>>               goto out;
> >>>>       ret = true;
> >>>>    out:
> >>>> @@ -1231,7 +1233,7 @@ static int defrag_collect_targets(struct btrfs_inode *inode,
> >>>>                       goto next;
> >>>>
> >>>>               next_mergeable = defrag_check_next_extent(&inode->vfs_inode, em,
> >>>> -                                                      locked);
> >>>> +                                                      extent_thresh, locked);
> >>>>               if (!next_mergeable) {
> >>>>                       struct defrag_target_range *last;
> >>>>
> >>>> --
> >>>> 2.34.1
> >>>>
> >>>
> >>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/3] btrfs: defrag: use extent_thresh to replace the hardcoded size limit
  2022-01-26 13:37           ` Filipe Manana
@ 2022-01-26 23:57             ` Qu Wenruo
  2022-01-27 10:58               ` Filipe Manana
  0 siblings, 1 reply; 16+ messages in thread
From: Qu Wenruo @ 2022-01-26 23:57 UTC (permalink / raw)
  To: Filipe Manana; +Cc: Qu Wenruo, linux-btrfs



On 2022/1/26 21:37, Filipe Manana wrote:
> On Wed, Jan 26, 2022 at 1:00 PM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>>
>>
>>
>> On 2022/1/26 20:36, Filipe Manana wrote:
>>> On Wed, Jan 26, 2022 at 12:26 PM Qu Wenruo <wqu@suse.com> wrote:
>>>>
>>>>
>>>>
>>>> On 2022/1/26 19:40, Filipe Manana wrote:
>>>>> On Wed, Jan 26, 2022 at 08:58:49AM +0800, Qu Wenruo wrote:
>>>>>> In defrag_lookup_extent() we use hardcoded extent size threshold, SZ_128K,
>>>>>> other than @extent_thresh in btrfs_defrag_file().
>>>>>>
>>>>>> This can lead to some inconsistent behavior, especially the default
>>>>>> extent size threshold is 256K.
>>>>>>
>>>>>> Fix this by passing @extent_thresh into defrag_check_next_extent() and
>>>>>> use that value.
>>>>>>
>>>>>> Also, since the extent_thresh check should be applied to all extents,
>>>>>> not only physically adjacent extents, move the threshold check into a
>>>>>> dedicate if ().
>>>>>>
>>>>>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>>>>>> ---
>>>>>>     fs/btrfs/ioctl.c | 12 +++++++-----
>>>>>>     1 file changed, 7 insertions(+), 5 deletions(-)
>>>>>>
>>>>>> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
>>>>>> index 0d8bfc716e6b..2911df12fc48 100644
>>>>>> --- a/fs/btrfs/ioctl.c
>>>>>> +++ b/fs/btrfs/ioctl.c
>>>>>> @@ -1050,7 +1050,7 @@ static struct extent_map *defrag_lookup_extent(struct inode *inode, u64 start,
>>>>>>     }
>>>>>>
>>>>>>     static bool defrag_check_next_extent(struct inode *inode, struct extent_map *em,
>>>>>> -                                 bool locked)
>>>>>> +                                 u32 extent_thresh, bool locked)
>>>>>>     {
>>>>>>        struct extent_map *next;
>>>>>>        bool ret = false;
>>>>>> @@ -1066,9 +1066,11 @@ static bool defrag_check_next_extent(struct inode *inode, struct extent_map *em,
>>>>>>        /* Preallocated */
>>>>>>        if (test_bit(EXTENT_FLAG_PREALLOC, &em->flags))
>>>>>>                goto out;
>>>>>> -    /* Physically adjacent and large enough */
>>>>>> -    if ((em->block_start + em->block_len == next->block_start) &&
>>>>>> -        (em->block_len > SZ_128K && next->block_len > SZ_128K))
>>>>>> +    /* Extent is already large enough */
>>>>>> +    if (next->len >= extent_thresh)
>>>>>> +            goto out;
>>>>>
>>>>> So this will trigger unnecessary rewrites of compressed extents.
>>>>> The SZ_128K is there to deal with compressed extents, it has nothing to
>>>>> do with the threshold passed to the ioctl.
>>>>
>>>> Then there is still something wrong.
>>>>
>>>> The original check will only reject it when both conditions are met.
>>>>
>>>> So based on your script, I can still find a way to defrag the extents,
>>>> with or without this modification:
>>>
>>> Right, without the intermediary write to file "baz", this patchset
>>> brings a regression in regards to
>>> compressed extents - when they are adjacent, which is typically the
>>> case when doing large writes,
>>> as they'll create multiple extents covering consecutive 128K ranges.
>>>
>>> With the write to file "baz", as I pasted it, it happens before and
>>> after the patchset.
>>>
>>>>
>>>>           mkfs.btrfs -f $DEV
>>>>           mount -o compress $DEV $MNT
>>>>
>>>>           xfs_io -f -c "pwrite -S 0xab 0 128K" $MNT/file1
>>>>           sync
>>>>           xfs_io -f -c "pwrite -S 0xab 0 128K" $MNT/file2
>>>>           sync
>>>>           xfs_io -f -c "pwrite -S 0xab 128K 128K" $MNT/file1
>>>>           sync
>>>>
>>>>           echo "=== file1 before defrag ==="
>>>>           xfs_io -f -c "fiemap -v" $MNT/file1
>>>>           echo "=== file1 after defrag ==="
>>>>           btrfs fi defrag $MNT/file1
>>>>           sync
>>>>           xfs_io -f -c "fiemap -v" $MNT/file1
>>>>
>>>> The output looks like this:
>>>>
>>>> === before ===
>>>> /mnt/btrfs/file1:
>>>>     EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
>>>>       0: [0..255]:        26624..26879       256   0x8
>>>>       1: [256..511]:      26640..26895       256   0x9
>>>> === after ===
>>>> /mnt/btrfs/file1:
>>>>     EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
>>>>       0: [0..255]:        26648..26903       256   0x8
>>>>       1: [256..511]:      26656..26911       256   0x9
>>>>
>>>> No matter if the patch is applied, the result is the same.
>>>
>>> Yes, explained above.
>>>
>>>>
>>>> So thank you very much for finding another case we're not handling well...
>>>>
>>>>
>>>> BTW, if the check is want to reject adjacent non-compressed extent, the
>>>> original one is still incorrect, we can have extents smaller than 128K
>>>> and is still uncompressed.
>>>>
>>>> So what we really want is to reject physically adjacent, non-compressed
>>>> extents?
>>>
>>> We want to avoid doing work that does nothing.
>>> If 2 consecutive extents are compressed and at least one is already
>>> 128K, then it's a waste of time, IO and CPU.
>>
>> So can we define the behavior like this?
>>
>>    If the extent is already at its max capacity (compressed 128K,
>>     non-compressed 128M), we don't defrag it.
>
> My previous suggestion was: if one of the extents is compressed and
> its size is 128K, don't include it for defrag.

Yep, your previous suggestion handles it well, I'd just want to add a
similar check for the uncompressed case (which may be too rare to hit though)

>
> There's probably other cases to think about: 1 compressed extent
> representing 100K of data, followed by another compressed extent
> representing 64K of data for example.
> In that case using both for defrag will still result in 2 extents, 1
> for 128K of data and another for 36K of data - still not worth it to
> defrag them, we end up with 2 extents, just different sizes.

Yes, that's also a factor to consider.

And with the target list we can determine how many compressed extents it
would result in.
The missing piece is the number of the original extents.

I think this would be a target for later optimization.

>
> At the very least we should not regress on what we did not defrag before:
>
> 2 extents with physically contiguous ranges, representing 128K of data
> each, both compressed.

Although the original code is not really working, as it checks
block_len > SZ_128K, while our maximum compressed extent size is
exactly 128K.

I'll fix the problem first with a better check and comments.

Thanks,
Qu

>
> Which is a very common case.
>
> Thanks.
>
>>
>> This also means, we need to do the same check in
>> defrag_collect_targets() to avoid defragging such extent.
>>
>> Thanks,
>> Qu
>>
>>
>>>
>>> And that's a fairly common scenario. Do a one megabyte write for
>>> example, then after writeback we end up with several 128K extents with
>>> compression.
>>> In that case defrag should do nothing for the whole range.
>>>
>>>
>>>>
>>>> Thanks,
>>>> Qu
>>>>>
>>>>> After applying this patchset, if you run a trivial test like this:
>>>>>
>>>>>       #!/bin/bash
>>>>>
>>>>>       DEV=/dev/sdj
>>>>>       MNT=/mnt/sdj
>>>>>
>>>>>       mkfs.btrfs -f $DEV
>>>>>       mount -o compress $DEV $MNT
>>>>>
>>>>>       xfs_io -f -c "pwrite -S 0xab 0 128K" $MNT/foobar
>>>>>       sync
>>>>>       # Write to some other file so that the next extent for foobar
>>>>>       # is not contiguous with the first extent.
>>>>>       xfs_io -f -c "pwrite 0 128K" $MNT/baz
>>>>>       sync
>>>>>       xfs_io -f -c "pwrite -S 0xcd 128K 128K" $MNT/foobar
>>>>>       sync
>>>>>
>>>>>       echo -e "\n\nTree after creating file:\n\n"
>>>>>       btrfs inspect-internal dump-tree -t 5 $DEV
>>>>>
>>>>>       btrfs filesystem defragment $MNT/foobar
>>>>>       sync
>>>>>
>>>>>       echo -e "\n\nTree after defrag:\n\n"
>>>>>       btrfs inspect-internal dump-tree -t 5 $DEV
>>>>>
>>>>>       umount $MNT
>>>>>
>>>>> It will result in rewriting the two 128K compressed extents:
>>>>>
>>>>> (...)
>>>>> Tree after write and sync:
>>>>>
>>>>> btrfs-progs v5.12.1
>>>>> fs tree key (FS_TREE ROOT_ITEM 0)
>>>>> (...)
>>>>>         item 7 key (257 INODE_REF 256) itemoff 15797 itemsize 16
>>>>>                 index 2 namelen 6 name: foobar
>>>>>         item 8 key (257 EXTENT_DATA 0) itemoff 15744 itemsize 53
>>>>>                 generation 6 type 1 (regular)
>>>>>                 extent data disk byte 13631488 nr 4096
>>>>>                 extent data offset 0 nr 131072 ram 131072
>>>>>                 extent compression 1 (zlib)
>>>>>         item 9 key (257 EXTENT_DATA 131072) itemoff 15691 itemsize 53
>>>>>                 generation 8 type 1 (regular)
>>>>>                 extent data disk byte 14163968 nr 4096
>>>>>                 extent data offset 0 nr 131072 ram 131072
>>>>>                 extent compression 1 (zlib)
>>>>> (...)
>>>>>
>>>>> Tree after defrag:
>>>>>
>>>>> btrfs-progs v5.12.1
>>>>> fs tree key (FS_TREE ROOT_ITEM 0)
>>>>> (...)
>>>>>         item 7 key (257 INODE_REF 256) itemoff 15797 itemsize 16
>>>>>                 index 2 namelen 6 name: foobar
>>>>>         item 8 key (257 EXTENT_DATA 0) itemoff 15744 itemsize 53
>>>>>                 generation 9 type 1 (regular)
>>>>>                 extent data disk byte 14430208 nr 4096
>>>>>                 extent data offset 0 nr 131072 ram 131072
>>>>>                 extent compression 1 (zlib)
>>>>>         item 9 key (257 EXTENT_DATA 131072) itemoff 15691 itemsize 53
>>>>>                 generation 9 type 1 (regular)
>>>>>                 extent data disk byte 13635584 nr 4096
>>>>>                 extent data offset 0 nr 131072 ram 131072
>>>>>                 extent compression 1 (zlib)
>>>>>
>>>>> In other words, a waste of IO and CPU time.
>>>>>
>>>>> So it needs to check if we are dealing with compressed extents, and
> > > > > > if so, skip if either of them has a size of SZ_128K (and changelog updated).
>>>>>
>>>>> Thanks.
>>>>>
>>>>>> +    /* Physically adjacent */
>>>>>> +    if ((em->block_start + em->block_len == next->block_start))
>>>>>>                goto out;
>>>>>>        ret = true;
>>>>>>     out:
>>>>>> @@ -1231,7 +1233,7 @@ static int defrag_collect_targets(struct btrfs_inode *inode,
>>>>>>                        goto next;
>>>>>>
>>>>>>                next_mergeable = defrag_check_next_extent(&inode->vfs_inode, em,
>>>>>> -                                                      locked);
>>>>>> +                                                      extent_thresh, locked);
>>>>>>                if (!next_mergeable) {
>>>>>>                        struct defrag_target_range *last;
>>>>>>
>>>>>> --
>>>>>> 2.34.1
>>>>>>
>>>>>
>>>>


* Re: [PATCH 2/3] btrfs: defrag: use extent_thresh to replace the hardcoded size limit
  2022-01-26 23:57             ` Qu Wenruo
@ 2022-01-27 10:58               ` Filipe Manana
  2022-01-27 11:11                 ` Forza
  0 siblings, 1 reply; 16+ messages in thread
From: Filipe Manana @ 2022-01-27 10:58 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: Qu Wenruo, linux-btrfs

On Thu, Jan 27, 2022 at 07:57:32AM +0800, Qu Wenruo wrote:
> 
> 
> On 2022/1/26 21:37, Filipe Manana wrote:
> > On Wed, Jan 26, 2022 at 1:00 PM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
> > > 
> > > 
> > > 
> > > On 2022/1/26 20:36, Filipe Manana wrote:
> > > > On Wed, Jan 26, 2022 at 12:26 PM Qu Wenruo <wqu@suse.com> wrote:
> > > > > 
> > > > > 
> > > > > 
> > > > > On 2022/1/26 19:40, Filipe Manana wrote:
> > > > > > On Wed, Jan 26, 2022 at 08:58:49AM +0800, Qu Wenruo wrote:
> > > > > > > In defrag_lookup_extent() we use hardcoded extent size threshold, SZ_128K,
> > > > > > > other than @extent_thresh in btrfs_defrag_file().
> > > > > > > 
> > > > > > > This can lead to some inconsistent behavior, especially the default
> > > > > > > extent size threshold is 256K.
> > > > > > > 
> > > > > > > Fix this by passing @extent_thresh into defrag_check_next_extent() and
> > > > > > > use that value.
> > > > > > > 
> > > > > > > Also, since the extent_thresh check should be applied to all extents,
> > > > > > > not only physically adjacent extents, move the threshold check into a
> > > > > > > dedicate if ().
> > > > > > > 
> > > > > > > Signed-off-by: Qu Wenruo <wqu@suse.com>
> > > > > > > ---
> > > > > > >     fs/btrfs/ioctl.c | 12 +++++++-----
> > > > > > >     1 file changed, 7 insertions(+), 5 deletions(-)
> > > > > > > 
> > > > > > > diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> > > > > > > index 0d8bfc716e6b..2911df12fc48 100644
> > > > > > > --- a/fs/btrfs/ioctl.c
> > > > > > > +++ b/fs/btrfs/ioctl.c
> > > > > > > @@ -1050,7 +1050,7 @@ static struct extent_map *defrag_lookup_extent(struct inode *inode, u64 start,
> > > > > > >     }
> > > > > > > 
> > > > > > >     static bool defrag_check_next_extent(struct inode *inode, struct extent_map *em,
> > > > > > > -                                 bool locked)
> > > > > > > +                                 u32 extent_thresh, bool locked)
> > > > > > >     {
> > > > > > >        struct extent_map *next;
> > > > > > >        bool ret = false;
> > > > > > > @@ -1066,9 +1066,11 @@ static bool defrag_check_next_extent(struct inode *inode, struct extent_map *em,
> > > > > > >        /* Preallocated */
> > > > > > >        if (test_bit(EXTENT_FLAG_PREALLOC, &em->flags))
> > > > > > >                goto out;
> > > > > > > -    /* Physically adjacent and large enough */
> > > > > > > -    if ((em->block_start + em->block_len == next->block_start) &&
> > > > > > > -        (em->block_len > SZ_128K && next->block_len > SZ_128K))
> > > > > > > +    /* Extent is already large enough */
> > > > > > > +    if (next->len >= extent_thresh)
> > > > > > > +            goto out;
> > > > > > 
> > > > > > So this will trigger unnecessary rewrites of compressed extents.
> > > > > > The SZ_128K is there to deal with compressed extents, it has nothing to
> > > > > > do with the threshold passed to the ioctl.
> > > > > 
> > > > > Then there is still something wrong.
> > > > > 
> > > > > The original check will only reject it when both conditions are met.
> > > > > 
> > > > > So based on your script, I can still find a way to defrag the extents,
> > > > > with or without this modification:
> > > > 
> > > > Right, without the intermediary write to file "baz", this patchset
> > > > brings a regression in regards to
> > > > compressed extents - when they are adjacent, which is typically the
> > > > case when doing large writes,
> > > > as they'll create multiple extents covering consecutive 128K ranges.
> > > > 
> > > > With the write to file "baz", as I pasted it, it happens before and
> > > > after the patchset.
> > > > 
> > > > > 
> > > > >           mkfs.btrfs -f $DEV
> > > > >           mount -o compress $DEV $MNT
> > > > > 
> > > > >           xfs_io -f -c "pwrite -S 0xab 0 128K" $MNT/file1
> > > > >           sync
> > > > >           xfs_io -f -c "pwrite -S 0xab 0 128K" $MNT/file2
> > > > >           sync
> > > > >           xfs_io -f -c "pwrite -S 0xab 128K 128K" $MNT/file1
> > > > >           sync
> > > > > 
> > > > >           echo "=== file1 before defrag ==="
> > > > >           xfs_io -f -c "fiemap -v" $MNT/file1
> > > > >           echo "=== file1 after defrag ==="
> > > > >           btrfs fi defrag $MNT/file1
> > > > >           sync
> > > > >           xfs_io -f -c "fiemap -v" $MNT/file1
> > > > > 
> > > > > The output looks like this:
> > > > > 
> > > > > === before ===
> > > > > /mnt/btrfs/file1:
> > > > >     EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
> > > > >       0: [0..255]:        26624..26879       256   0x8
> > > > >       1: [256..511]:      26640..26895       256   0x9
> > > > > === after ===
> > > > > /mnt/btrfs/file1:
> > > > >     EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
> > > > >       0: [0..255]:        26648..26903       256   0x8
> > > > >       1: [256..511]:      26656..26911       256   0x9
> > > > > 
> > > > > No matter if the patch is applied, the result is the same.
> > > > 
> > > > Yes, explained above.
> > > > 
> > > > > 
> > > > > So thank you very much for finding another case we're not handling well...
> > > > > 
> > > > > 
> > > > > BTW, if the check is want to reject adjacent non-compressed extent, the
> > > > > original one is still incorrect, we can have extents smaller than 128K
> > > > > and is still uncompressed.
> > > > > 
> > > > > So what we really want is to reject physically adjacent, non-compressed
> > > > > extents?
> > > > 
> > > > We want to avoid doing work that does nothing.
> > > > If 2 consecutive extents are compressed and at least one is already
> > > > 128K, then it's a waste of time, IO and CPU.
> > > 
> > > So can we define the behavior like this?
> > > 
> > >    If the extent is already at its max capacity (compressed 128K,
> > >     non-compressed 128M), we don't defrag it.
> > 
> > My previous suggestion was: if one of the extents is compressed and
> > its size is 128K, don't include it for defrag.
> 
> Yep, your previous suggestion handles it well, I'd just want to add a
> similar check for the uncompressed case (which may be too rare to hit though)
> 
> > 
> > There's probably other cases to think about: 1 compressed extent
> > representing 100K of data, followed by another compressed extent
> > representing 64K of data for example.
> > In that case using both for defrag will still result in 2 extents, 1
> > for 128K of data and another for 36K of data - still not worth it to
> > defrag them, we end up with 2 extents, just different sizes.
> 
> Yes, that's also a factor to consider.
> 
> And with the target list we can determine how many compressed extents it
> would result in.
> The missing piece is the number of the original extents.
> 
> I think this would be a target for later optimization.

Sure. Don't forget about possibly less common cases like having a mix
of compressed extents smaller than 128K followed by non-compressed
extents, with sizes like in the example above, where after defrag
we would still end up with 2 extents, just of different sizes.

It's certainly possible after cloning from a compressed file to one
without compression and vice-versa.

> 
> > 
> > At the very least we should not regress on what we did not defrag before:
> > 
> > 2 extents with physically contiguous ranges, representing 128K of data
> > each, both compressed.
> 
> Although the original code is not really working, as it checks
> block_len > SZ_128K, while our maximum compressed extent size is
> exactly 128K.
> 
> I'll fix the problem first with better check and comments.

Sounds fine.

Thanks.

> 
> Thanks,
> Qu
> 
> > 
> > Which is a very common case.
> > 
> > Thanks.
> > 
> > > 
> > > This also means, we need to do the same check in
> > > defrag_collect_targets() to avoid defragging such extent.
> > > 
> > > Thanks,
> > > Qu
> > > 
> > > 
> > > > 
> > > > And that's a fairly common scenario. Do a one megabyte write for
> > > > example, then after writeback we end up with several 128K extents with
> > > > compression.
> > > > In that case defrag should do nothing for the whole range.
> > > > 
> > > > 
> > > > > 
> > > > > Thanks,
> > > > > Qu
> > > > > > 
> > > > > > After applying this patchset, if you run a trivial test like this:
> > > > > > 
> > > > > >       #!/bin/bash
> > > > > > 
> > > > > >       DEV=/dev/sdj
> > > > > >       MNT=/mnt/sdj
> > > > > > 
> > > > > >       mkfs.btrfs -f $DEV
> > > > > >       mount -o compress $DEV $MNT
> > > > > > 
> > > > > >       xfs_io -f -c "pwrite -S 0xab 0 128K" $MNT/foobar
> > > > > >       sync
> > > > > >       # Write to some other file so that the next extent for foobar
> > > > > >       # is not contiguous with the first extent.
> > > > > >       xfs_io -f -c "pwrite 0 128K" $MNT/baz
> > > > > >       sync
> > > > > >       xfs_io -f -c "pwrite -S 0xcd 128K 128K" $MNT/foobar
> > > > > >       sync
> > > > > > 
> > > > > >       echo -e "\n\nTree after creating file:\n\n"
> > > > > >       btrfs inspect-internal dump-tree -t 5 $DEV
> > > > > > 
> > > > > >       btrfs filesystem defragment $MNT/foobar
> > > > > >       sync
> > > > > > 
> > > > > >       echo -e "\n\nTree after defrag:\n\n"
> > > > > >       btrfs inspect-internal dump-tree -t 5 $DEV
> > > > > > 
> > > > > >       umount $MNT
> > > > > > 
> > > > > > It will result in rewriting the two 128K compressed extents:
> > > > > > 
> > > > > > (...)
> > > > > > Tree after write and sync:
> > > > > > 
> > > > > > btrfs-progs v5.12.1
> > > > > > fs tree key (FS_TREE ROOT_ITEM 0)
> > > > > > (...)
> > > > > >         item 7 key (257 INODE_REF 256) itemoff 15797 itemsize 16
> > > > > >                 index 2 namelen 6 name: foobar
> > > > > >         item 8 key (257 EXTENT_DATA 0) itemoff 15744 itemsize 53
> > > > > >                 generation 6 type 1 (regular)
> > > > > >                 extent data disk byte 13631488 nr 4096
> > > > > >                 extent data offset 0 nr 131072 ram 131072
> > > > > >                 extent compression 1 (zlib)
> > > > > >         item 9 key (257 EXTENT_DATA 131072) itemoff 15691 itemsize 53
> > > > > >                 generation 8 type 1 (regular)
> > > > > >                 extent data disk byte 14163968 nr 4096
> > > > > >                 extent data offset 0 nr 131072 ram 131072
> > > > > >                 extent compression 1 (zlib)
> > > > > > (...)
> > > > > > 
> > > > > > Tree after defrag:
> > > > > > 
> > > > > > btrfs-progs v5.12.1
> > > > > > fs tree key (FS_TREE ROOT_ITEM 0)
> > > > > > (...)
> > > > > >         item 7 key (257 INODE_REF 256) itemoff 15797 itemsize 16
> > > > > >                 index 2 namelen 6 name: foobar
> > > > > >         item 8 key (257 EXTENT_DATA 0) itemoff 15744 itemsize 53
> > > > > >                 generation 9 type 1 (regular)
> > > > > >                 extent data disk byte 14430208 nr 4096
> > > > > >                 extent data offset 0 nr 131072 ram 131072
> > > > > >                 extent compression 1 (zlib)
> > > > > >         item 9 key (257 EXTENT_DATA 131072) itemoff 15691 itemsize 53
> > > > > >                 generation 9 type 1 (regular)
> > > > > >                 extent data disk byte 13635584 nr 4096
> > > > > >                 extent data offset 0 nr 131072 ram 131072
> > > > > >                 extent compression 1 (zlib)
> > > > > > 
> > > > > > In other words, a waste of IO and CPU time.
> > > > > > 
> > > > > > So it needs to check if we are dealing with compressed extents, and
> > > > > > if so, skip if either of them has a size of SZ_128K (and changelog updated).
> > > > > > 
> > > > > > Thanks.
> > > > > > 
> > > > > > > +    /* Physically adjacent */
> > > > > > > +    if ((em->block_start + em->block_len == next->block_start))
> > > > > > >                goto out;
> > > > > > >        ret = true;
> > > > > > >     out:
> > > > > > > @@ -1231,7 +1233,7 @@ static int defrag_collect_targets(struct btrfs_inode *inode,
> > > > > > >                        goto next;
> > > > > > > 
> > > > > > >                next_mergeable = defrag_check_next_extent(&inode->vfs_inode, em,
> > > > > > > -                                                      locked);
> > > > > > > +                                                      extent_thresh, locked);
> > > > > > >                if (!next_mergeable) {
> > > > > > >                        struct defrag_target_range *last;
> > > > > > > 
> > > > > > > --
> > > > > > > 2.34.1
> > > > > > > 
> > > > > > 
> > > > > 


* Re: [PATCH 2/3] btrfs: defrag: use extent_thresh to replace the hardcoded size limit
  2022-01-27 10:58               ` Filipe Manana
@ 2022-01-27 11:11                 ` Forza
  0 siblings, 0 replies; 16+ messages in thread
From: Forza @ 2022-01-27 11:11 UTC (permalink / raw)
  To: linux-btrfs Mailinglist



---- From: Filipe Manana <fdmanana@kernel.org> -- Sent: 2022-01-27 - 11:58 ----

> On Thu, Jan 27, 2022 at 07:57:32AM +0800, Qu Wenruo wrote:
>> 
>> 
>> On 2022/1/26 21:37, Filipe Manana wrote:
>> > On Wed, Jan 26, 2022 at 1:00 PM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>> > > 
>> > > 
>> > > 
>> > > On 2022/1/26 20:36, Filipe Manana wrote:
>> > > > On Wed, Jan 26, 2022 at 12:26 PM Qu Wenruo <wqu@suse.com> wrote:
>> > > > > 
>> > > > > 
>> > > > > 
>> > > > > On 2022/1/26 19:40, Filipe Manana wrote:
>> > > > > > On Wed, Jan 26, 2022 at 08:58:49AM +0800, Qu Wenruo wrote:
>> > > > > > > In defrag_lookup_extent() we use hardcoded extent size threshold, SZ_128K,
>> > > > > > > other than @extent_thresh in btrfs_defrag_file().
>> > > > > > > 
>> > > > > > > This can lead to some inconsistent behavior, especially the default
>> > > > > > > extent size threshold is 256K.
>> > > > > > > 
>> > > > > > > Fix this by passing @extent_thresh into defrag_check_next_extent() and
>> > > > > > > use that value.
>> > > > > > > 
>> > > > > > > Also, since the extent_thresh check should be applied to all extents,
>> > > > > > > not only physically adjacent extents, move the threshold check into a
>> > > > > > > dedicate if ().
>> > > > > > > 
>> > > > > > > Signed-off-by: Qu Wenruo <wqu@suse.com>
>> > > > > > > ---
>> > > > > > >     fs/btrfs/ioctl.c | 12 +++++++-----
>> > > > > > >     1 file changed, 7 insertions(+), 5 deletions(-)
>> > > > > > > 
>> > > > > > > diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
>> > > > > > > index 0d8bfc716e6b..2911df12fc48 100644
>> > > > > > > --- a/fs/btrfs/ioctl.c
>> > > > > > > +++ b/fs/btrfs/ioctl.c
>> > > > > > > @@ -1050,7 +1050,7 @@ static struct extent_map *defrag_lookup_extent(struct inode *inode, u64 start,
>> > > > > > >     }
>> > > > > > > 
>> > > > > > >     static bool defrag_check_next_extent(struct inode *inode, struct extent_map *em,
>> > > > > > > -                                 bool locked)
>> > > > > > > +                                 u32 extent_thresh, bool locked)
>> > > > > > >     {
>> > > > > > >        struct extent_map *next;
>> > > > > > >        bool ret = false;
>> > > > > > > @@ -1066,9 +1066,11 @@ static bool defrag_check_next_extent(struct inode *inode, struct extent_map *em,
>> > > > > > >        /* Preallocated */
>> > > > > > >        if (test_bit(EXTENT_FLAG_PREALLOC, &em->flags))
>> > > > > > >                goto out;
>> > > > > > > -    /* Physically adjacent and large enough */
>> > > > > > > -    if ((em->block_start + em->block_len == next->block_start) &&
>> > > > > > > -        (em->block_len > SZ_128K && next->block_len > SZ_128K))
>> > > > > > > +    /* Extent is already large enough */
>> > > > > > > +    if (next->len >= extent_thresh)
>> > > > > > > +            goto out;
>> > > > > > 
>> > > > > > So this will trigger unnecessary rewrites of compressed extents.
>> > > > > > The SZ_128K is there to deal with compressed extents, it has nothing to
>> > > > > > do with the threshold passed to the ioctl.
>> > > > > 
>> > > > > Then there is still something wrong.
>> > > > > 
>> > > > > The original check will only reject it when both conditions are met.
>> > > > > 
>> > > > > So based on your script, I can still find a way to defrag the extents,
>> > > > > with or without this modification:
>> > > > 
>> > > > Right, without the intermediary write to file "baz", this patchset
>> > > > brings a regression in regards to
>> > > > compressed extents - when they are adjacent, which is typically the
>> > > > case when doing large writes,
>> > > > as they'll create multiple extents covering consecutive 128K ranges.
>> > > > 
>> > > > With the write to file "baz", as I pasted it, it happens before and
>> > > > after the patchset.
>> > > > 
>> > > > > 
>> > > > >           mkfs.btrfs -f $DEV
>> > > > >           mount -o compress $DEV $MNT
>> > > > > 
>> > > > >           xfs_io -f -c "pwrite -S 0xab 0 128K" $MNT/file1
>> > > > >           sync
>> > > > >           xfs_io -f -c "pwrite -S 0xab 0 128K" $MNT/file2
>> > > > >           sync
>> > > > >           xfs_io -f -c "pwrite -S 0xab 128K 128K" $MNT/file1
>> > > > >           sync
>> > > > > 
>> > > > >           echo "=== file1 before defrag ==="
>> > > > >           xfs_io -f -c "fiemap -v" $MNT/file1
>> > > > >           echo "=== file1 after defrag ==="
>> > > > >           btrfs fi defrag $MNT/file1
>> > > > >           sync
>> > > > >           xfs_io -f -c "fiemap -v" $MNT/file1
>> > > > > 
>> > > > > The output looks like this:
>> > > > > 
>> > > > > === before ===
>> > > > > /mnt/btrfs/file1:
>> > > > >     EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
>> > > > >       0: [0..255]:        26624..26879       256   0x8
>> > > > >       1: [256..511]:      26640..26895       256   0x9
>> > > > > === after ===
>> > > > > /mnt/btrfs/file1:
>> > > > >     EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
>> > > > >       0: [0..255]:        26648..26903       256   0x8
>> > > > >       1: [256..511]:      26656..26911       256   0x9
>> > > > > 
>> > > > > No matter if the patch is applied, the result is the same.
>> > > > 
>> > > > Yes, explained above.
>> > > > 
>> > > > > 
>> > > > > So thank you very much for finding another case we're not handling well...
>> > > > > 
>> > > > > 
>> > > > > BTW, if the check is meant to reject adjacent non-compressed
>> > > > > extents, the original one is still incorrect, as we can have
>> > > > > extents smaller than 128K which are still uncompressed.
>> > > > > 
>> > > > > So what we really want is to reject physically adjacent, non-compressed
>> > > > > extents?
>> > > > 
>> > > > We want to avoid doing work that does nothing.
>> > > > If 2 consecutive extents are compressed and at least one is already
>> > > > 128K, then it's a waste of time, IO and CPU.
>> > > 
>> > > So can we define the behavior like this?
>> > > 
>> > >    If the extent is already at its max capacity (compressed 128K,
>> > >     non-compressed 128M), we don't defrag it.
>> > 
>> > My previous suggestion was: if one of the extents is compressed and
>> > its size is 128K, don't include it for defrag.
>> 
>> Yep, your previous one can handle it well, I'd just want to add a
>> similar check for uncompressed extents (which may be too rare to hit though)
>> 
>> > 
>> > There are probably other cases to think about: 1 compressed extent
>> > representing 100K of data, followed by another compressed extent
>> > representing 64K of data for example.
>> > In that case using both for defrag will still result in 2 extents, 1
>> > for 128K of data and another for 36K of data - still not worth it to
>> > defrag them, as we end up with 2 extents, just with different sizes.
>> 
>> Yes, that's also a factor to consider.
>> 
>> And with the target list we can determine how many compressed extents it
>> would result in.
>> The missing piece is the number of the original extents.
>> 
>> I think this would be a target for later optimization.
> 
> Sure. Don't forget about possibly less common cases like having a mix
> of compressed extents smaller than 128K followed by non-compressed
> extents, with sizes like in the above example, where after defrag
> we would still end up with 2 extents, just of different sizes.
> 
> It's certainly possible after cloning from a compressed file to one
> without compression and vice-versa.
> 


It may also be possible that two compressed extents with a combined size >128K would compress better when combined into a single extent than each of them individually, and so end up fitting inside one final 128K extent.

How advanced is it reasonable for the logic to be?

>> 
>> > 
>> > At the very least we should not regress on what we did not defrag before:
>> > 
>> > 2 extents with physically contiguous ranges, representing 128K of data
>> > each, both compressed.
>> 
>> Although the original code is not really working, as it's checking
>> block_len > SZ_128K, while our maximum compressed extent size is
>> exactly 128K.
>> 
>> I'll fix the problem first with better check and comments.
> 
> Sounds fine.
> 
> Thanks.
> 
>> 
>> Thanks,
>> Qu
>> 
>> > 
>> > Which is a very common case.
>> > 
>> > Thanks.
>> > 
>> > > 
>> > > This also means, we need to do the same check in
>> > > defrag_collect_targets() to avoid defragging such extent.
>> > > 
>> > > Thanks,
>> > > Qu
>> > > 
>> > > 
>> > > > 
>> > > > And that's a fairly common scenario. Do a one megabyte write for
>> > > > example, then after writeback we end up with several 128K extents with
>> > > > compression.
>> > > > In that case defrag should do nothing for the whole range.
>> > > > 
>> > > > 
>> > > > > 
>> > > > > Thanks,
>> > > > > Qu
>> > > > > > 
>> > > > > > After applying this patchset, if you run a trivial test like this:
>> > > > > > 
>> > > > > >       #!/bin/bash
>> > > > > > 
>> > > > > >       DEV=/dev/sdj
>> > > > > >       MNT=/mnt/sdj
>> > > > > > 
>> > > > > >       mkfs.btrfs -f $DEV
>> > > > > >       mount -o compress $DEV $MNT
>> > > > > > 
>> > > > > >       xfs_io -f -c "pwrite -S 0xab 0 128K" $MNT/foobar
>> > > > > >       sync
>> > > > > >       # Write to some other file so that the next extent for foobar
>> > > > > >       # is not contiguous with the first extent.
>> > > > > >       xfs_io -f -c "pwrite 0 128K" $MNT/baz
>> > > > > >       sync
>> > > > > >       xfs_io -f -c "pwrite -S 0xcd 128K 128K" $MNT/foobar
>> > > > > >       sync
>> > > > > > 
>> > > > > >       echo -e "\n\nTree after creating file:\n\n"
>> > > > > >       btrfs inspect-internal dump-tree -t 5 $DEV
>> > > > > > 
>> > > > > >       btrfs filesystem defragment $MNT/foobar
>> > > > > >       sync
>> > > > > > 
>> > > > > >       echo -e "\n\nTree after defrag:\n\n"
>> > > > > >       btrfs inspect-internal dump-tree -t 5 $DEV
>> > > > > > 
>> > > > > >       umount $MNT
>> > > > > > 
>> > > > > > It will result in rewriting the two 128K compressed extents:
>> > > > > > 
>> > > > > > (...)
>> > > > > > Tree after write and sync:
>> > > > > > 
>> > > > > > btrfs-progs v5.12.1
>> > > > > > fs tree key (FS_TREE ROOT_ITEM 0)
>> > > > > > (...)
>> > > > > >         item 7 key (257 INODE_REF 256) itemoff 15797 itemsize 16
>> > > > > >                 index 2 namelen 6 name: foobar
>> > > > > >         item 8 key (257 EXTENT_DATA 0) itemoff 15744 itemsize 53
>> > > > > >                 generation 6 type 1 (regular)
>> > > > > >                 extent data disk byte 13631488 nr 4096
>> > > > > >                 extent data offset 0 nr 131072 ram 131072
>> > > > > >                 extent compression 1 (zlib)
>> > > > > >         item 9 key (257 EXTENT_DATA 131072) itemoff 15691 itemsize 53
>> > > > > >                 generation 8 type 1 (regular)
>> > > > > >                 extent data disk byte 14163968 nr 4096
>> > > > > >                 extent data offset 0 nr 131072 ram 131072
>> > > > > >                 extent compression 1 (zlib)
>> > > > > > (...)
>> > > > > > 
>> > > > > > Tree after defrag:
>> > > > > > 
>> > > > > > btrfs-progs v5.12.1
>> > > > > > fs tree key (FS_TREE ROOT_ITEM 0)
>> > > > > > (...)
>> > > > > >         item 7 key (257 INODE_REF 256) itemoff 15797 itemsize 16
>> > > > > >                 index 2 namelen 6 name: foobar
>> > > > > >         item 8 key (257 EXTENT_DATA 0) itemoff 15744 itemsize 53
>> > > > > >                 generation 9 type 1 (regular)
>> > > > > >                 extent data disk byte 14430208 nr 4096
>> > > > > >                 extent data offset 0 nr 131072 ram 131072
>> > > > > >                 extent compression 1 (zlib)
>> > > > > >         item 9 key (257 EXTENT_DATA 131072) itemoff 15691 itemsize 53
>> > > > > >                 generation 9 type 1 (regular)
>> > > > > >                 extent data disk byte 13635584 nr 4096
>> > > > > >                 extent data offset 0 nr 131072 ram 131072
>> > > > > >                 extent compression 1 (zlib)
>> > > > > > 
>> > > > > > In other words, a waste of IO and CPU time.
>> > > > > > 
>> > > > > > So it needs to check if we are dealing with compressed extents, and
>> > > > > > if so, skip when either of them has a size of SZ_128K (and update
>> > > > > > the changelog).
>> > > > > > 
>> > > > > > Thanks.
>> > > > > > 
>> > > > > > > +    /* Physically adjacent */
>> > > > > > > +    if ((em->block_start + em->block_len == next->block_start))
>> > > > > > >                goto out;
>> > > > > > >        ret = true;
>> > > > > > >     out:
>> > > > > > > @@ -1231,7 +1233,7 @@ static int defrag_collect_targets(struct btrfs_inode *inode,
>> > > > > > >                        goto next;
>> > > > > > > 
>> > > > > > >                next_mergeable = defrag_check_next_extent(&inode->vfs_inode, em,
>> > > > > > > -                                                      locked);
>> > > > > > > +                                                      extent_thresh, locked);
>> > > > > > >                if (!next_mergeable) {
>> > > > > > >                        struct defrag_target_range *last;
>> > > > > > > 
>> > > > > > > --
>> > > > > > > 2.34.1
>> > > > > > > 
>> > > > > > 
>> > > > > 




* Re: [PATCH v3 1/3] btrfs: defrag: don't try to merge regular extents with preallocated extents
  2022-01-26  0:58 [PATCH v3 1/3] btrfs: defrag: don't try to merge regular extents with preallocated extents Qu Wenruo
                   ` (2 preceding siblings ...)
  2022-01-26 11:26 ` [PATCH v3 1/3] btrfs: defrag: don't try to merge regular extents with preallocated extents Filipe Manana
@ 2022-01-28  6:31 ` Qu Wenruo
  3 siblings, 0 replies; 16+ messages in thread
From: Qu Wenruo @ 2022-01-28  6:31 UTC (permalink / raw)
  To: linux-btrfs



On 2022/1/26 08:58, Qu Wenruo wrote:
> [BUG]
> With older kernels (before v5.16), btrfs will defrag preallocated extents.
> While with newer kernels (v5.16 and newer) btrfs will not defrag
> preallocated extents, but it will defrag the extent just before the
> preallocated extent, even if it's just a single sector.
> 
> This can be exposed by the following small script:
> 
> 	mkfs.btrfs -f $dev > /dev/null
> 
> 	mount $dev $mnt
> 	xfs_io -f -c "pwrite 0 4k" -c sync -c "falloc 4k 16K" $mnt/file
> 	xfs_io -c "fiemap -v" $mnt/file
> 	btrfs fi defrag $mnt/file
> 	sync
> 	xfs_io -c "fiemap -v" $mnt/file
> 
> The output looks like this on older kernels:
> 
> /mnt/btrfs/file:
>   EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
>     0: [0..7]:          26624..26631         8   0x0
>     1: [8..39]:         26632..26663        32 0x801
> /mnt/btrfs/file:
>   EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
>     0: [0..39]:         26664..26703        40   0x1
> 
> This defrags the single sector along with the preallocated extent, and
> replaces them with a regular extent at a new location (caused by data
> COW).
> This wastes most of the data IO just for the preallocated range.
> 
> On the other hand, v5.16 is slightly better:
> 
> /mnt/btrfs/file:
>   EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
>     0: [0..7]:          26624..26631         8   0x0
>     1: [8..39]:         26632..26663        32 0x801
> /mnt/btrfs/file:
>   EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
>     0: [0..7]:          26664..26671         8   0x0
>     1: [8..39]:         26632..26663        32 0x801
> 
> The preallocated range is not defragged, but the sector before it still
> gets defragged, which is unnecessary.
> 
> [CAUSE]
> One of the functions shared by the old and new behavior is
> defrag_check_next_extent(), which determines whether we should defrag
> the current extent by checking the next one.
> 
> It only checks if the next extent is a hole or inlined, but it doesn't
> check if it's preallocated.
> 
> On the other hand, outside of this function, both old and new kernels
> reject preallocated extents.
> 
> This inconsistency causes the behavior described above.
> 
> [FIX]
> - Also check if next extent is preallocated
>    If so, don't defrag current extent.
> 
> - Add comments for each branch why we reject the extent
> 
> This will reduce the IO caused by defrag ioctl and autodefrag.
> 
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
> Changelog:
> v2:
> - Use @extent_thresh from the caller to replace the hardcoded threshold
>    Now the caller has full control over the extent threshold value.
> 
> - Remove the old ambiguous check based on physical address
>    The original check is too specific, only rejecting extents which are
>    both physically adjacent AND too large.
>    Since we now have a correct size check, and the physically adjacent
>    check is not always a win, remove the old check completely.
> 
> v3:
> - Split the @extent_thresh and physically adjacent checks into separate
>    patches
> 
> - Simplify the comment
> ---
>   fs/btrfs/ioctl.c | 20 +++++++++++++-------
>   1 file changed, 13 insertions(+), 7 deletions(-)
> 
> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> index 91ba2efe9792..0d8bfc716e6b 100644
> --- a/fs/btrfs/ioctl.c
> +++ b/fs/btrfs/ioctl.c
> @@ -1053,19 +1053,25 @@ static bool defrag_check_next_extent(struct inode *inode, struct extent_map *em,
>   				     bool locked)
>   {
>   	struct extent_map *next;
> -	bool ret = true;
> +	bool ret = false;
>   
>   	/* this is the last extent */
>   	if (em->start + em->len >= i_size_read(inode))
> -		return false;
> +		return ret;
>   
>   	next = defrag_lookup_extent(inode, em->start + em->len, locked);
> +	/* No more em or hole */
>   	if (!next || next->block_start >= EXTENT_MAP_LAST_BYTE)
> -		ret = false;
> -	else if ((em->block_start + em->block_len == next->block_start) &&
> -		 (em->block_len > SZ_128K && next->block_len > SZ_128K))
> -		ret = false;
> -
> +		goto out;
> +	/* Preallocated */
> +	if (test_bit(EXTENT_FLAG_PREALLOC, &em->flags))

Nope, during extra tests for unrelated work I exposed this bug: we
should check @next, not @em.

Needs to be updated again...

Thanks,
Qu
> +		goto out;
> +	/* Physically adjacent and large enough */
> +	if ((em->block_start + em->block_len == next->block_start) &&
> +	    (em->block_len > SZ_128K && next->block_len > SZ_128K))
> +		goto out;
> +	ret = true;
> +out:
>   	free_extent_map(next);
>   	return ret;
>   }



end of thread, other threads:[~2022-01-28  6:32 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-01-26  0:58 [PATCH v3 1/3] btrfs: defrag: don't try to merge regular extents with preallocated extents Qu Wenruo
2022-01-26  0:58 ` [PATCH 2/3] btrfs: defrag: use extent_thresh to replace the hardcoded size limit Qu Wenruo
2022-01-26 11:40   ` Filipe Manana
2022-01-26 12:26     ` Qu Wenruo
2022-01-26 12:36       ` Filipe Manana
2022-01-26 13:00         ` Qu Wenruo
2022-01-26 13:37           ` Filipe Manana
2022-01-26 23:57             ` Qu Wenruo
2022-01-27 10:58               ` Filipe Manana
2022-01-27 11:11                 ` Forza
2022-01-26  0:58 ` [PATCH 3/3] btrfs: defrag: remove the physical adjacent extents rejection in defrag_check_next_extent() Qu Wenruo
2022-01-26 11:44   ` Filipe Manana
2022-01-26 11:26 ` [PATCH v3 1/3] btrfs: defrag: don't try to merge regular extents with preallocated extents Filipe Manana
2022-01-26 11:33   ` Qu Wenruo
2022-01-26 11:47     ` Filipe Manana
2022-01-28  6:31 ` Qu Wenruo
