[PATCH v2] btrfs: defrag: fix the wrong number of defragged sectors

* [PATCH v2] btrfs: defrag: fix the wrong number of defragged sectors
@ 2022-01-18  7:19 Qu Wenruo
  2022-01-18 14:33 ` Filipe Manana
  2022-01-19 17:18 ` David Sterba
  0 siblings, 2 replies; 4+ messages in thread
From: Qu Wenruo @ 2022-01-18  7:19 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Anthony Ruhier

[BUG]
There are users using autodefrag mount option reporting obvious increase
in IO:

> If I compare the write average (in total, I don't have it per process)
> when taking idle periods on the same machine:
>     Linux 5.16:
>         without autodefrag: ~ 10KiB/s
>         with autodefrag: between 1 and 2MiB/s.
>
>     Linux 5.15:
>         with autodefrag:~ 10KiB/s (around the same as without
> autodefrag on 5.16)

[CAUSE]
When autodefrag mount option is enabled, btrfs_defrag_file() will be
called with @max_sectors = BTRFS_DEFRAG_BATCH (1024) to limit how many
sectors we can defrag in one try.

And then use the number of sectors defragged to determine if we need to
re-defrag.

But commit b18c3ab2343d ("btrfs: defrag: introduce helper to defrag one
cluster") uses wrong unit to increase @sectors_defragged, which should
be in unit of sector, not byte.

This means, if we have defragged any sector, then @sectors_defragged
will be >= sectorsize (normally 4096), which is larger than
BTRFS_DEFRAG_BATCH.

This makes the @max_sectors check in defrag_one_cluster() to underflow,
rendering the whole @max_sectors check useless.

Thus causing way more IO for autodefrag mount options, as now there is
no limit on how many sectors can really be defragged.

[FIX]
Fix the problems by:

- Use sector as unit when increaseing @sectors_defragged

- Include @sectors_defragged > @max_sectors case to break the loop

- Add extra comment on the return value of btrfs_defrag_file()

Reported-by: Anthony Ruhier <aruhier@mailbox.org>
Fixes: b18c3ab2343d ("btrfs: defrag: introduce helper to defrag one cluster")
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
Changelog:
v2:
- Update the commit message to include the root cause of extra IO

- Keep @sectors_defragged update where it is
  Since the over-reported @sectors_defragged is not the real reason,
  keep the patch smaller.
---
 fs/btrfs/ioctl.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 6ad2bc2e5af3..2a77273d91fe 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -1442,8 +1442,8 @@ static int defrag_one_cluster(struct btrfs_inode *inode,
 	list_for_each_entry(entry, &target_list, list) {
 		u32 range_len = entry->len;
 
-		/* Reached the limit */
-		if (max_sectors && max_sectors == *sectors_defragged)
+		/* Reached or beyond the limit */
+		if (max_sectors && *sectors_defragged >= max_sectors)
 			break;
 
 		if (max_sectors)
@@ -1465,7 +1465,8 @@ static int defrag_one_cluster(struct btrfs_inode *inode,
 				       extent_thresh, newer_than, do_compress);
 		if (ret < 0)
 			break;
-		*sectors_defragged += range_len;
+		*sectors_defragged += range_len >>
+				      inode->root->fs_info->sectorsize_bits;
 	}
 out:
 	list_for_each_entry_safe(entry, tmp, &target_list, list) {
@@ -1484,6 +1485,9 @@ static int defrag_one_cluster(struct btrfs_inode *inode,
  * @newer_than:	   minimum transid to defrag
  * @max_to_defrag: max number of sectors to be defragged, if 0, the whole inode
  *		   will be defragged.
+ *
+ * Return <0 for error.
+ * Return >=0 for the number of sectors defragged.
  */
 int btrfs_defrag_file(struct inode *inode, struct file_ra_state *ra,
 		      struct btrfs_ioctl_defrag_range_args *range,
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread