linux-btrfs.vger.kernel.org archive mirror
* [PATCH v3 0/5] btrfs: defrag: don't waste CPU time on non-target extent
@ 2022-02-05  5:41 Qu Wenruo
  2022-02-05  5:41 ` [PATCH v3 1/5] btrfs: uapi: introduce BTRFS_DEFRAG_RANGE_MASK for later sanity check Qu Wenruo
                   ` (5 more replies)
  0 siblings, 6 replies; 13+ messages in thread
From: Qu Wenruo @ 2022-02-05  5:41 UTC (permalink / raw)
  To: linux-btrfs

In the rework of btrfs_defrag_file() one core idea is to defrag cluster
by cluster, thus we can have a better layered code structure, just like
what we have now:

btrfs_defrag_file()
|- defrag_one_cluster()
   |- defrag_one_range()
      |- defrag_one_locked_range()

But there is a catch: btrfs_defrag_file() just moves the cluster to the
next cluster, never considering cases where the current extent is already
too large and we can skip to its end directly.

This increases CPU usage on very large but unfragmented files.

Fix the behavior in defrag_one_cluster() so that defrag_collect_targets()
will report where the next search should start from.

If the current extent is not a target at all, then we can jump to the
end of that non-target extent to save time.
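
In pseudo-C the idea is roughly the following (a sketch only; in the
real patches the skip decision lives inside defrag_collect_targets(),
and is_defrag_target() is a made-up helper standing in for the target
checks there):

	while (cur < last_byte) {
		em = defrag_lookup_extent(&inode->vfs_inode, cur, false);
		if (!em)
			break;
		if (!is_defrag_target(ctrl, em)) {
			/* Jump over the whole non-target extent at once */
			cur = extent_map_end(em);
			free_extent_map(em);
			continue;
		}
		free_extent_map(em);
		/* Defrag one 256K cluster as before */
		ret = defrag_one_cluster(inode, ra, ctrl, len);
		cur = ctrl->last_scanned;
	}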

To get the missing optimization, also introduce a new structure,
btrfs_defrag_ctrl, so we don't need to pass things like @newer_than and
@max_to_defrag around.

This also removes weird behaviors like reusing range::start for the next
search location.

And since we need to convert the old btrfs_ioctl_defrag_range_args to the
newer btrfs_defrag_ctrl, also do extra sanity checks in the conversion
function.

Such a cleanup will also bring us closer to exposing these extra policy
parameters in a future enhanced defrag ioctl interface.
(Unfortunately, the reserved space of the existing defrag ioctl is not
large enough to contain them all.)

Changelog:
v2:
- Rebased to latest misc-next
  Just one small conflict with the static_assert() update.
  And this time only these patches are rebased onto misc-next, so they
  may conflict with future fixes for defrag_check_next_extent().

- Several grammar fixes

- Report accurate btrfs_defrag_ctrl::sectors_defragged
  This is inspired by a comment from Filipe that the skip check
  should be done in the defrag_collect_targets() call inside
  defrag_one_range().

  This results in a new patch in v2.

- Change the timing of btrfs_defrag_ctrl::last_scanned update
  Now it's updated inside defrag_one_range(), which will give
  us an accurate view, unlike the previous call site in
  defrag_one_cluster().

- Don't change the timing of the extent threshold check.

- Rename @last_target to @last_is_target in defrag_collect_targets()

v3:
- Add Reviewed-by tags

- Fix a wrong value in the commit message of the 1st patch

- Make @orig_start const for the 3rd patch

- Add a missing word "skip" in the 5th patch

- Remove one unnecessary assignment in the 5th patch
  As we don't return the defragged sectors to user space.

Qu Wenruo (5):
  btrfs: uapi: introduce BTRFS_DEFRAG_RANGE_MASK for later sanity check
  btrfs: defrag: introduce btrfs_defrag_ctrl structure for later usage
  btrfs: defrag: use btrfs_defrag_ctrl to replace
    btrfs_ioctl_defrag_range_args for btrfs_defrag_file()
  btrfs: defrag: make btrfs_defrag_file() to report accurate number of
    defragged sectors
  btrfs: defrag: allow defrag_one_cluster() to skip large extent which
    is not a target

 fs/btrfs/ctree.h           |  22 +++-
 fs/btrfs/file.c            |  17 ++-
 fs/btrfs/ioctl.c           | 219 ++++++++++++++++++++++---------------
 include/uapi/linux/btrfs.h |   6 +-
 4 files changed, 163 insertions(+), 101 deletions(-)

-- 
2.35.0



* [PATCH v3 1/5] btrfs: uapi: introduce BTRFS_DEFRAG_RANGE_MASK for later sanity check
  2022-02-05  5:41 [PATCH v3 0/5] btrfs: defrag: don't waste CPU time on non-target extent Qu Wenruo
@ 2022-02-05  5:41 ` Qu Wenruo
  2022-02-05  5:41 ` [PATCH v3 2/5] btrfs: defrag: introduce btrfs_defrag_ctrl structure for later usage Qu Wenruo
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: Qu Wenruo @ 2022-02-05  5:41 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Filipe Manana

Introduce a new macro, BTRFS_DEFRAG_RANGE_FLAGS_MASK, which contains all
the supported flags, for a later sanity check on the flags passed in
from user space.

And since we're here, replace the hardcoded bit flags (1, 2) with
(1UL << 0) and (1UL << 1), respectively.
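
For illustration, such a sanity check can then reject unknown flags with
a single test (a sketch; the actual check is added by a later patch in
this series):

	/* Reject any flag bits we do not know about */
	if (args->flags & ~BTRFS_DEFRAG_RANGE_FLAGS_MASK)
		return -EOPNOTSUPP;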

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
---
 include/uapi/linux/btrfs.h | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/btrfs.h b/include/uapi/linux/btrfs.h
index 1cb1a3860f1d..57332a529c4e 100644
--- a/include/uapi/linux/btrfs.h
+++ b/include/uapi/linux/btrfs.h
@@ -576,8 +576,10 @@ struct btrfs_ioctl_clone_range_args {
  * Used by:
  * struct btrfs_ioctl_defrag_range_args.flags
  */
-#define BTRFS_DEFRAG_RANGE_COMPRESS 1
-#define BTRFS_DEFRAG_RANGE_START_IO 2
+#define BTRFS_DEFRAG_RANGE_COMPRESS	(1UL << 0)
+#define BTRFS_DEFRAG_RANGE_START_IO	(1UL << 1)
+#define BTRFS_DEFRAG_RANGE_FLAGS_MASK	(BTRFS_DEFRAG_RANGE_COMPRESS |\
+					 BTRFS_DEFRAG_RANGE_START_IO)
 struct btrfs_ioctl_defrag_range_args {
 	/* start of the defrag operation */
 	__u64 start;
-- 
2.35.0



* [PATCH v3 2/5] btrfs: defrag: introduce btrfs_defrag_ctrl structure for later usage
  2022-02-05  5:41 [PATCH v3 0/5] btrfs: defrag: don't waste CPU time on non-target extent Qu Wenruo
  2022-02-05  5:41 ` [PATCH v3 1/5] btrfs: uapi: introduce BTRFS_DEFRAG_RANGE_MASK for later sanity check Qu Wenruo
@ 2022-02-05  5:41 ` Qu Wenruo
  2022-02-05  5:41 ` [PATCH v3 3/5] btrfs: defrag: use btrfs_defrag_ctrl to replace btrfs_ioctl_defrag_range_args for btrfs_defrag_file() Qu Wenruo
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: Qu Wenruo @ 2022-02-05  5:41 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Filipe Manana

Currently btrfs_defrag_file() accepts not only
btrfs_ioctl_defrag_range_args but also other parameters like @newer_than
and @max_sectors_to_defrag for extra policies.

Those extra values are hidden from the defrag ioctl and have even caused
bugs in the past due to the different behaviors based on them.

Here we introduce a new structure, btrfs_defrag_ctrl, to include:

- all members in btrfs_ioctl_defrag_range_args

- @max_sectors_to_defrag and @newer_than

- Extra values which callers of btrfs_defrag_file() may care about,
  like @sectors_defragged and @last_scanned.

With the new structure, also introduce a new helper,
btrfs_defrag_ioctl_args_to_ctrl() (a usage sketch follows the list
below) to:

- Do extra sanity check on @compress and @flags

- Do range alignment when possible

- Set default values.
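
For illustration, the defrag ioctl path is expected to use the helper
like this (a sketch matching how a later patch in this series wires it
up in btrfs_ioctl_defrag()):

	struct btrfs_defrag_ctrl ctrl = {};
	int ret;

	/* Sanity check and convert the user supplied range args */
	ret = btrfs_defrag_ioctl_args_to_ctrl(root->fs_info, &range, &ctrl,
					      0, BTRFS_OLDEST_GENERATION);
	if (ret < 0)
		return ret;
	ret = btrfs_defrag_file(file_inode(file), &file->f_ra, &ctrl);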

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/ctree.h | 19 +++++++++++++++++++
 fs/btrfs/ioctl.c | 41 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 60 insertions(+)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 9f7a950b8a69..670622fddd62 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -3318,6 +3318,25 @@ int btrfs_fileattr_set(struct user_namespace *mnt_userns,
 int btrfs_ioctl_get_supported_features(void __user *arg);
 void btrfs_sync_inode_flags_to_i_flags(struct inode *inode);
 int __pure btrfs_is_empty_uuid(u8 *uuid);
+
+struct btrfs_defrag_ctrl {
+	/* Input, read-only fields */
+	u64	start;
+	u64	len;
+	u32	extent_thresh;
+	u64	newer_than;
+	u64	max_sectors_to_defrag;
+	u8	compress;
+	u8	flags;
+
+	/* Output fields */
+	u64	sectors_defragged;
+	u64	last_scanned;	/* Exclusive bytenr */
+};
+int btrfs_defrag_ioctl_args_to_ctrl(struct btrfs_fs_info *fs_info,
+				    struct btrfs_ioctl_defrag_range_args *args,
+				    struct btrfs_defrag_ctrl *ctrl,
+				    u64 max_sectors_to_defrag, u64 newer_than);
 int btrfs_defrag_file(struct inode *inode, struct file_ra_state *ra,
 		      struct btrfs_ioctl_defrag_range_args *range,
 		      u64 newer_than, unsigned long max_to_defrag);
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 001b0882dfd8..c265b19885db 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -1509,6 +1509,47 @@ static int defrag_one_cluster(struct btrfs_inode *inode,
 	return ret;
 }
 
+/*
+ * Convert the old ioctl format to the new btrfs_defrag_ctrl structure.
+ *
+ * Will also do basic tasks like setting default values and sanity checks.
+ */
+int btrfs_defrag_ioctl_args_to_ctrl(struct btrfs_fs_info *fs_info,
+				    struct btrfs_ioctl_defrag_range_args *args,
+				    struct btrfs_defrag_ctrl *ctrl,
+				    u64 max_sectors_to_defrag, u64 newer_than)
+{
+	u64 range_end;
+
+	if (args->flags & ~BTRFS_DEFRAG_RANGE_FLAGS_MASK)
+		return -EOPNOTSUPP;
+	if (args->compress_type >= BTRFS_NR_COMPRESS_TYPES)
+		return -EOPNOTSUPP;
+
+	ctrl->start = round_down(args->start, fs_info->sectorsize);
+	/*
+	 * If @len does not overflow with @start nor is -1, align the length.
+	 * Otherwise set it to (u64)-1 so later btrfs_defrag_file() will
+	 * determine the length using isize.
+	 */
+	if (!check_add_overflow(args->start, args->len, &range_end) &&
+	    args->len != (u64)-1)
+		ctrl->len = round_up(range_end, fs_info->sectorsize) -
+			    ctrl->start;
+	else
+		ctrl->len = -1;
+	ctrl->flags = args->flags;
+	ctrl->compress = args->compress_type;
+	if (args->extent_thresh == 0)
+		ctrl->extent_thresh = SZ_256K;
+	else
+		ctrl->extent_thresh = args->extent_thresh;
+	ctrl->newer_than = newer_than;
+	ctrl->last_scanned = 0;
+	ctrl->sectors_defragged = 0;
+	return 0;
+}
+
 /*
  * Entry point to file defragmentation.
  *
-- 
2.35.0



* [PATCH v3 3/5] btrfs: defrag: use btrfs_defrag_ctrl to replace btrfs_ioctl_defrag_range_args for btrfs_defrag_file()
  2022-02-05  5:41 [PATCH v3 0/5] btrfs: defrag: don't waste CPU time on non-target extent Qu Wenruo
  2022-02-05  5:41 ` [PATCH v3 1/5] btrfs: uapi: introduce BTRFS_DEFRAG_RANGE_MASK for later sanity check Qu Wenruo
  2022-02-05  5:41 ` [PATCH v3 2/5] btrfs: defrag: introduce btrfs_defrag_ctrl structure for later usage Qu Wenruo
@ 2022-02-05  5:41 ` Qu Wenruo
  2022-02-05  5:41 ` [PATCH v3 4/5] btrfs: defrag: make btrfs_defrag_file() to report accurate number of defragged sectors Qu Wenruo
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: Qu Wenruo @ 2022-02-05  5:41 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Filipe Manana

This brings the following benefits:

- No more strange range->start updates to indicate the last scanned bytenr
  We have btrfs_defrag_ctrl::last_scanned (exclusive) for that directly.

- No more return value to indicate defragged sectors
  Now btrfs_defrag_file() will just return 0 if no error happened,
  and btrfs_defrag_ctrl::sectors_defragged will show that value.

- Fewer parameters to carry around
  Now most defrag_* functions only need to fetch their policy parameters
  from btrfs_defrag_ctrl directly.
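
As an example, the autodefrag path boils down to the following after
this patch (condensed from the file.c hunk below):

	struct btrfs_defrag_ctrl ctrl = {};

	ctrl.len = (u64)-1;
	ctrl.start = defrag->last_offset;
	ctrl.newer_than = defrag->transid;
	ctrl.max_sectors_to_defrag = BTRFS_DEFRAG_BATCH;

	ret = btrfs_defrag_file(inode, NULL, &ctrl);
	/* Progress is now read back from the ctrl structure */
	if (ctrl.sectors_defragged == BTRFS_DEFRAG_BATCH)
		defrag->last_offset = ctrl.last_scanned;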

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/ctree.h |   3 +-
 fs/btrfs/file.c  |  17 +++---
 fs/btrfs/ioctl.c | 144 ++++++++++++++++++++---------------------------
 3 files changed, 71 insertions(+), 93 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 670622fddd62..ac222a9ce166 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -3338,8 +3338,7 @@ int btrfs_defrag_ioctl_args_to_ctrl(struct btrfs_fs_info *fs_info,
 				    struct btrfs_defrag_ctrl *ctrl,
 				    u64 max_sectors_to_defrag, u64 newer_than);
 int btrfs_defrag_file(struct inode *inode, struct file_ra_state *ra,
-		      struct btrfs_ioctl_defrag_range_args *range,
-		      u64 newer_than, unsigned long max_to_defrag);
+		      struct btrfs_defrag_ctrl *ctrl);
 void btrfs_get_block_group_info(struct list_head *groups_list,
 				struct btrfs_ioctl_space_info *space);
 void btrfs_update_ioctl_balance_args(struct btrfs_fs_info *fs_info,
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 11204dbbe053..f5de8ab9787e 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -277,8 +277,7 @@ static int __btrfs_run_defrag_inode(struct btrfs_fs_info *fs_info,
 {
 	struct btrfs_root *inode_root;
 	struct inode *inode;
-	struct btrfs_ioctl_defrag_range_args range;
-	int num_defrag;
+	struct btrfs_defrag_ctrl ctrl = {};
 	int ret;
 
 	/* get the inode */
@@ -297,21 +296,21 @@ static int __btrfs_run_defrag_inode(struct btrfs_fs_info *fs_info,
 
 	/* do a chunk of defrag */
 	clear_bit(BTRFS_INODE_IN_DEFRAG, &BTRFS_I(inode)->runtime_flags);
-	memset(&range, 0, sizeof(range));
-	range.len = (u64)-1;
-	range.start = defrag->last_offset;
+	ctrl.len = (u64)-1;
+	ctrl.start = defrag->last_offset;
+	ctrl.newer_than = defrag->transid;
+	ctrl.max_sectors_to_defrag = BTRFS_DEFRAG_BATCH;
 
 	sb_start_write(fs_info->sb);
-	num_defrag = btrfs_defrag_file(inode, NULL, &range, defrag->transid,
-				       BTRFS_DEFRAG_BATCH);
+	ret = btrfs_defrag_file(inode, NULL, &ctrl);
 	sb_end_write(fs_info->sb);
 	/*
 	 * if we filled the whole defrag batch, there
 	 * must be more work to do.  Queue this defrag
 	 * again
 	 */
-	if (num_defrag == BTRFS_DEFRAG_BATCH) {
-		defrag->last_offset = range.start;
+	if (ctrl.sectors_defragged == BTRFS_DEFRAG_BATCH) {
+		defrag->last_offset = ctrl.last_scanned;
 		btrfs_requeue_inode_defrag(BTRFS_I(inode), defrag);
 	} else if (defrag->last_offset && !defrag->cycled) {
 		/*
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index c265b19885db..e100305d8bc2 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -1177,22 +1177,18 @@ struct defrag_target_range {
 /*
  * Collect all valid target extents.
  *
+ * @ctrl:	   extra defrag policy control
  * @start:	   file offset to lookup
  * @len:	   length to lookup
- * @extent_thresh: file extent size threshold, any extent size >= this value
- *		   will be ignored
- * @newer_than:    only defrag extents newer than this value
- * @do_compress:   whether the defrag is doing compression
- *		   if true, @extent_thresh will be ignored and all regular
- *		   file extents meeting @newer_than will be targets.
  * @locked:	   if the range has already held extent lock
  * @target_list:   list of targets file extents
  */
 static int defrag_collect_targets(struct btrfs_inode *inode,
-				  u64 start, u64 len, u32 extent_thresh,
-				  u64 newer_than, bool do_compress,
-				  bool locked, struct list_head *target_list)
+				  const struct btrfs_defrag_ctrl *ctrl,
+				  u64 start, u32 len, bool locked,
+				  struct list_head *target_list)
 {
+	bool do_compress = ctrl->flags & BTRFS_DEFRAG_RANGE_COMPRESS;
 	u64 cur = start;
 	int ret = 0;
 
@@ -1212,7 +1208,7 @@ static int defrag_collect_targets(struct btrfs_inode *inode,
 			goto next;
 
 		/* Skip older extent */
-		if (em->generation < newer_than)
+		if (em->generation < ctrl->newer_than)
 			goto next;
 
 		/*
@@ -1252,7 +1248,7 @@ static int defrag_collect_targets(struct btrfs_inode *inode,
 			goto add;
 
 		/* Skip too large extent */
-		if (range_len >= extent_thresh)
+		if (range_len >= ctrl->extent_thresh)
 			goto next;
 
 		next_mergeable = defrag_check_next_extent(&inode->vfs_inode, em,
@@ -1374,8 +1370,9 @@ static int defrag_one_locked_target(struct btrfs_inode *inode,
 	return ret;
 }
 
-static int defrag_one_range(struct btrfs_inode *inode, u64 start, u32 len,
-			    u32 extent_thresh, u64 newer_than, bool do_compress)
+static int defrag_one_range(struct btrfs_inode *inode,
+			    const struct btrfs_defrag_ctrl *ctrl,
+			    u64 start, u32 len)
 {
 	struct extent_state *cached_state = NULL;
 	struct defrag_target_range *entry;
@@ -1419,8 +1416,7 @@ static int defrag_one_range(struct btrfs_inode *inode, u64 start, u32 len,
 	 * And this time we have extent locked already, pass @locked = true
 	 * so that we won't relock the extent range and cause deadlock.
 	 */
-	ret = defrag_collect_targets(inode, start, len, extent_thresh,
-				     newer_than, do_compress, true,
+	ret = defrag_collect_targets(inode, ctrl, start, len, true,
 				     &target_list);
 	if (ret < 0)
 		goto unlock_extent;
@@ -1451,12 +1447,16 @@ static int defrag_one_range(struct btrfs_inode *inode, u64 start, u32 len,
 	return ret;
 }
 
+/*
+ * Return <0 for error.
+ * Return >0 if we hit the ctrl->max_sectors_to_defrag limit
+ * Return 0 if we finished the range without error.
+ *
+ * For >= 0 case, ctrl->last_scanned and ctrl->sectors_defragged will be updated.
+ */
 static int defrag_one_cluster(struct btrfs_inode *inode,
 			      struct file_ra_state *ra,
-			      u64 start, u32 len, u32 extent_thresh,
-			      u64 newer_than, bool do_compress,
-			      unsigned long *sectors_defragged,
-			      unsigned long max_sectors)
+			      struct btrfs_defrag_ctrl *ctrl, u32 len)
 {
 	const u32 sectorsize = inode->root->fs_info->sectorsize;
 	struct defrag_target_range *entry;
@@ -1464,9 +1464,8 @@ static int defrag_one_cluster(struct btrfs_inode *inode,
 	LIST_HEAD(target_list);
 	int ret;
 
-	ret = defrag_collect_targets(inode, start, len, extent_thresh,
-				     newer_than, do_compress, false,
-				     &target_list);
+	ret = defrag_collect_targets(inode, ctrl, ctrl->last_scanned, len,
+				     false, &target_list);
 	if (ret < 0)
 		goto out;
 
@@ -1474,14 +1473,16 @@ static int defrag_one_cluster(struct btrfs_inode *inode,
 		u32 range_len = entry->len;
 
 		/* Reached or beyond the limit */
-		if (max_sectors && *sectors_defragged >= max_sectors) {
+		if (ctrl->max_sectors_to_defrag &&
+		    ctrl->sectors_defragged >= ctrl->max_sectors_to_defrag) {
 			ret = 1;
 			break;
 		}
 
-		if (max_sectors)
+		if (ctrl->max_sectors_to_defrag)
 			range_len = min_t(u32, range_len,
-				(max_sectors - *sectors_defragged) * sectorsize);
+					  (ctrl->max_sectors_to_defrag -
+					   ctrl->sectors_defragged) * sectorsize);
 
 		if (ra)
 			page_cache_sync_readahead(inode->vfs_inode.i_mapping,
@@ -1494,12 +1495,11 @@ static int defrag_one_cluster(struct btrfs_inode *inode,
 		 * But that's fine, it only affects the @sectors_defragged
 		 * accounting.
 		 */
-		ret = defrag_one_range(inode, entry->start, range_len,
-				       extent_thresh, newer_than, do_compress);
+		ret = defrag_one_range(inode, ctrl, entry->start, range_len);
 		if (ret < 0)
 			break;
-		*sectors_defragged += range_len >>
-				      inode->root->fs_info->sectorsize_bits;
+		ctrl->sectors_defragged += range_len >>
+					  inode->root->fs_info->sectorsize_bits;
 	}
 out:
 	list_for_each_entry_safe(entry, tmp, &target_list, list) {
@@ -1555,59 +1555,45 @@ int btrfs_defrag_ioctl_args_to_ctrl(struct btrfs_fs_info *fs_info,
  *
  * @inode:	   inode to be defragged
  * @ra:		   readahead state (can be NUL)
- * @range:	   defrag options including range and flags
- * @newer_than:	   minimum transid to defrag
- * @max_to_defrag: max number of sectors to be defragged, if 0, the whole inode
- *		   will be defragged.
+ * @ctrl:	   defrag options including range and various policy parameters
  *
  * Return <0 for error.
- * Return >=0 for the number of sectors defragged, and range->start will be updated
- * to indicate the file offset where next defrag should be started at.
- * (Mostly for autodefrag, which sets @max_to_defrag thus we may exit early without
- *  defragging all the range).
+ * Return 0 if the defrag is done without error, ctrl->last_scanned and
+ * ctrl->sectors_defragged will be updated.
  */
 int btrfs_defrag_file(struct inode *inode, struct file_ra_state *ra,
-		      struct btrfs_ioctl_defrag_range_args *range,
-		      u64 newer_than, unsigned long max_to_defrag)
+		      struct btrfs_defrag_ctrl *ctrl)
 {
 	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
-	unsigned long sectors_defragged = 0;
 	u64 isize = i_size_read(inode);
-	u64 cur;
 	u64 last_byte;
-	bool do_compress = range->flags & BTRFS_DEFRAG_RANGE_COMPRESS;
+	bool do_compress = ctrl->flags & BTRFS_DEFRAG_RANGE_COMPRESS;
 	bool ra_allocated = false;
-	int compress_type = BTRFS_COMPRESS_ZLIB;
 	int ret = 0;
-	u32 extent_thresh = range->extent_thresh;
 	pgoff_t start_index;
 
 	if (isize == 0)
 		return 0;
 
-	if (range->start >= isize)
+	if (ctrl->start >= isize)
 		return -EINVAL;
 
-	if (do_compress) {
-		if (range->compress_type >= BTRFS_NR_COMPRESS_TYPES)
-			return -EINVAL;
-		if (range->compress_type)
-			compress_type = range->compress_type;
-	}
+	if (ctrl->extent_thresh == 0)
+		ctrl->extent_thresh = SZ_256K;
 
-	if (extent_thresh == 0)
-		extent_thresh = SZ_256K;
+	if (do_compress)
+		ASSERT(ctrl->compress < BTRFS_NR_COMPRESS_TYPES);
 
-	if (range->start + range->len > range->start) {
+	if (ctrl->start + ctrl->len > ctrl->start) {
 		/* Got a specific range */
-		last_byte = min(isize, range->start + range->len);
+		last_byte = min(isize, ctrl->start + ctrl->len);
 	} else {
 		/* Defrag until file end */
 		last_byte = isize;
 	}
 
-	/* Align the range */
-	cur = round_down(range->start, fs_info->sectorsize);
+	ASSERT(IS_ALIGNED(ctrl->start, fs_info->sectorsize));
+	ctrl->last_scanned = ctrl->start;
 	last_byte = round_up(last_byte, fs_info->sectorsize) - 1;
 
 	/*
@@ -1626,12 +1612,12 @@ int btrfs_defrag_file(struct inode *inode, struct file_ra_state *ra,
 	 * Make writeback start from the beginning of the range, so that the
 	 * defrag range can be written sequentially.
 	 */
-	start_index = cur >> PAGE_SHIFT;
+	start_index = ctrl->last_scanned >> PAGE_SHIFT;
 	if (start_index < inode->i_mapping->writeback_index)
 		inode->i_mapping->writeback_index = start_index;
 
-	while (cur < last_byte) {
-		const unsigned long prev_sectors_defragged = sectors_defragged;
+	while (ctrl->last_scanned < last_byte) {
+		const unsigned long prev_sectors_defragged = ctrl->sectors_defragged;
 		u64 cluster_end;
 
 		if (btrfs_defrag_cancelled(fs_info)) {
@@ -1640,7 +1626,7 @@ int btrfs_defrag_file(struct inode *inode, struct file_ra_state *ra,
 		}
 
 		/* We want the cluster end at page boundary when possible */
-		cluster_end = (((cur >> PAGE_SHIFT) +
+		cluster_end = (((ctrl->last_scanned >> PAGE_SHIFT) +
 			       (SZ_256K >> PAGE_SHIFT)) << PAGE_SHIFT) - 1;
 		cluster_end = min(cluster_end, last_byte);
 
@@ -1655,19 +1641,17 @@ int btrfs_defrag_file(struct inode *inode, struct file_ra_state *ra,
 			break;
 		}
 		if (do_compress)
-			BTRFS_I(inode)->defrag_compress = compress_type;
-		ret = defrag_one_cluster(BTRFS_I(inode), ra, cur,
-				cluster_end + 1 - cur, extent_thresh,
-				newer_than, do_compress,
-				&sectors_defragged, max_to_defrag);
+			BTRFS_I(inode)->defrag_compress = ctrl->compress;
+		ret = defrag_one_cluster(BTRFS_I(inode), ra, ctrl,
+				cluster_end + 1 - ctrl->last_scanned);
 
-		if (sectors_defragged > prev_sectors_defragged)
+		if (ctrl->sectors_defragged > prev_sectors_defragged)
 			balance_dirty_pages_ratelimited(inode->i_mapping);
 
 		btrfs_inode_unlock(inode, 0);
 		if (ret < 0)
 			break;
-		cur = cluster_end + 1;
+		ctrl->last_scanned = cluster_end + 1;
 		if (ret > 0) {
 			ret = 0;
 			break;
@@ -1677,27 +1661,21 @@ int btrfs_defrag_file(struct inode *inode, struct file_ra_state *ra,
 
 	if (ra_allocated)
 		kfree(ra);
-	/*
-	 * Update range.start for autodefrag, this will indicate where to start
-	 * in next run.
-	 */
-	range->start = cur;
-	if (sectors_defragged) {
+	if (ctrl->sectors_defragged) {
 		/*
 		 * We have defragged some sectors, for compression case they
 		 * need to be written back immediately.
 		 */
-		if (range->flags & BTRFS_DEFRAG_RANGE_START_IO) {
+		if (ctrl->flags & BTRFS_DEFRAG_RANGE_START_IO) {
 			filemap_flush(inode->i_mapping);
 			if (test_bit(BTRFS_INODE_HAS_ASYNC_EXTENT,
 				     &BTRFS_I(inode)->runtime_flags))
 				filemap_flush(inode->i_mapping);
 		}
-		if (range->compress_type == BTRFS_COMPRESS_LZO)
+		if (ctrl->compress == BTRFS_COMPRESS_LZO)
 			btrfs_set_fs_incompat(fs_info, COMPRESS_LZO);
-		else if (range->compress_type == BTRFS_COMPRESS_ZSTD)
+		else if (ctrl->compress == BTRFS_COMPRESS_ZSTD)
 			btrfs_set_fs_incompat(fs_info, COMPRESS_ZSTD);
-		ret = sectors_defragged;
 	}
 	if (do_compress) {
 		btrfs_inode_lock(inode, 0);
@@ -3212,6 +3190,7 @@ static int btrfs_ioctl_defrag(struct file *file, void __user *argp)
 	struct inode *inode = file_inode(file);
 	struct btrfs_root *root = BTRFS_I(inode)->root;
 	struct btrfs_ioctl_defrag_range_args range = {0};
+	struct btrfs_defrag_ctrl ctrl = {};
 	int ret;
 
 	ret = mnt_want_write_file(file);
@@ -3257,10 +3236,11 @@ static int btrfs_ioctl_defrag(struct file *file, void __user *argp)
 			/* the rest are all set to zero by kzalloc */
 			range.len = (u64)-1;
 		}
-		ret = btrfs_defrag_file(file_inode(file), &file->f_ra,
-					&range, BTRFS_OLDEST_GENERATION, 0);
-		if (ret > 0)
-			ret = 0;
+		ret = btrfs_defrag_ioctl_args_to_ctrl(root->fs_info, &range,
+						      &ctrl, 0, BTRFS_OLDEST_GENERATION);
+		if (ret < 0)
+			break;
+		ret = btrfs_defrag_file(file_inode(file), &file->f_ra, &ctrl);
 		break;
 	default:
 		ret = -EINVAL;
-- 
2.35.0



* [PATCH v3 4/5] btrfs: defrag: make btrfs_defrag_file() to report accurate number of defragged sectors
  2022-02-05  5:41 [PATCH v3 0/5] btrfs: defrag: don't waste CPU time on non-target extent Qu Wenruo
                   ` (2 preceding siblings ...)
  2022-02-05  5:41 ` [PATCH v3 3/5] btrfs: defrag: use btrfs_defrag_ctrl to replace btrfs_ioctl_defrag_range_args for btrfs_defrag_file() Qu Wenruo
@ 2022-02-05  5:41 ` Qu Wenruo
  2022-02-05  5:41 ` [PATCH v3 5/5] btrfs: defrag: allow defrag_one_cluster() to skip large extent which is not a target Qu Wenruo
  2022-02-08 22:09 ` [PATCH v3 0/5] btrfs: defrag: don't waste CPU time on non-target extent David Sterba
  5 siblings, 0 replies; 13+ messages in thread
From: Qu Wenruo @ 2022-02-05  5:41 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Filipe Manana

The previous rework of btrfs_defrag_file() can only report the number of
sectors from the first run of defrag_collect_targets().

This number is not accurate: if holes are punched after the first
defrag_collect_targets() call, we will not choose to defrag the holes,
yet they are still counted.

Originally this was done to avoid passing @sectors_defragged to every
involved function.

But now that we have btrfs_defrag_ctrl, there is no need for such
inaccurate accounting; just update btrfs_defrag_ctrl::sectors_defragged
after each successful defrag_one_locked_target() call.
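
The accounting itself is just a shift by sectorsize_bits; e.g. with a 4K
sector size (sectorsize_bits == 12), a successfully defragged 16K target
adds 16384 >> 12 = 4 sectors:

	/* Account only what was really defragged */
	ctrl->sectors_defragged += entry->len >>
				   inode->root->fs_info->sectorsize_bits;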

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/ioctl.c | 12 +++---------
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index e100305d8bc2..fb991a8f3929 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -1371,7 +1371,7 @@ static int defrag_one_locked_target(struct btrfs_inode *inode,
 }
 
 static int defrag_one_range(struct btrfs_inode *inode,
-			    const struct btrfs_defrag_ctrl *ctrl,
+			    struct btrfs_defrag_ctrl *ctrl,
 			    u64 start, u32 len)
 {
 	struct extent_state *cached_state = NULL;
@@ -1426,6 +1426,8 @@ static int defrag_one_range(struct btrfs_inode *inode,
 					       &cached_state);
 		if (ret < 0)
 			break;
+		ctrl->sectors_defragged += entry->len >>
+					  inode->root->fs_info->sectorsize_bits;
 	}
 
 	list_for_each_entry_safe(entry, tmp, &target_list, list) {
@@ -1489,17 +1491,9 @@ static int defrag_one_cluster(struct btrfs_inode *inode,
 				ra, NULL, entry->start >> PAGE_SHIFT,
 				((entry->start + range_len - 1) >> PAGE_SHIFT) -
 				(entry->start >> PAGE_SHIFT) + 1);
-		/*
-		 * Here we may not defrag any range if holes are punched before
-		 * we locked the pages.
-		 * But that's fine, it only affects the @sectors_defragged
-		 * accounting.
-		 */
 		ret = defrag_one_range(inode, ctrl, entry->start, range_len);
 		if (ret < 0)
 			break;
-		ctrl->sectors_defragged += range_len >>
-					  inode->root->fs_info->sectorsize_bits;
 	}
 out:
 	list_for_each_entry_safe(entry, tmp, &target_list, list) {
-- 
2.35.0



* [PATCH v3 5/5] btrfs: defrag: allow defrag_one_cluster() to skip large extent which is not a target
  2022-02-05  5:41 [PATCH v3 0/5] btrfs: defrag: don't waste CPU time on non-target extent Qu Wenruo
                   ` (3 preceding siblings ...)
  2022-02-05  5:41 ` [PATCH v3 4/5] btrfs: defrag: make btrfs_defrag_file() to report accurate number of defragged sectors Qu Wenruo
@ 2022-02-05  5:41 ` Qu Wenruo
  2022-02-08 22:09 ` [PATCH v3 0/5] btrfs: defrag: don't waste CPU time on non-target extent David Sterba
  5 siblings, 0 replies; 13+ messages in thread
From: Qu Wenruo @ 2022-02-05  5:41 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Filipe Manana

In the rework of btrfs_defrag_file(), we always call
defrag_one_cluster() and increase the offset by the cluster size, which
is only 256K.

But there are cases where we have a large extent (e.g. 128M) which
doesn't need to be defragged at all.

Before the refactor, we could directly skip such a range, but now we
have to scan that extent map again and again until the cluster moves
past the non-target extent.

Fix the problem by allowing defrag_one_cluster() to advance
btrfs_defrag_ctrl::last_scanned to the end of an extent, if and only if
the last extent of the cluster is not a target.
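
The decision is made at the end of defrag_collect_targets(), which now
reports back the next search position (matching the hunk below):

	if (!ret && last_scanned_ret) {
		if (!last_is_target)
			/* Caller can skip to the end of that extent */
			*last_scanned_ret = cur;
		else
			/* Can only go to the end of the given range */
			*last_scanned_ret = start + len;
	}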

The test script looks like this:

	mkfs.btrfs -f $dev > /dev/null

	mount $dev $mnt

	# As btrfs ioctl uses 32M as extent_threshold
	xfs_io -f -c "pwrite 0 64M" $mnt/file1
	sync
	# Some fragmented ranges to defrag
	xfs_io -s -c "pwrite 65548k 4k" \
		  -c "pwrite 65544k 4k" \
		  -c "pwrite 65540k 4k" \
		  -c "pwrite 65536k 4k" \
		  $mnt/file1
	sync

	echo "=== before ==="
	xfs_io -c "fiemap -v" $mnt/file1
	echo "=== after ==="
	btrfs fi defrag $mnt/file1
	sync
	xfs_io -c "fiemap -v" $mnt/file1
	umount $mnt

With extra ftrace output put into defrag_one_cluster(), before the patch
it would result in tons of loops:

(As defrag_one_cluster() is inlined, the function name shown is that of
its caller)

  btrfs-126062  [005] .....  4682.816026: btrfs_defrag_file: r/i=5/257 start=0 len=262144
  btrfs-126062  [005] .....  4682.816027: btrfs_defrag_file: r/i=5/257 start=262144 len=262144
  btrfs-126062  [005] .....  4682.816028: btrfs_defrag_file: r/i=5/257 start=524288 len=262144
  btrfs-126062  [005] .....  4682.816028: btrfs_defrag_file: r/i=5/257 start=786432 len=262144
  btrfs-126062  [005] .....  4682.816028: btrfs_defrag_file: r/i=5/257 start=1048576 len=262144
  ...
  btrfs-126062  [005] .....  4682.816043: btrfs_defrag_file: r/i=5/257 start=67108864 len=262144

But with this patch there will be just one loop, which then jumps
directly to the end of the extent:

  btrfs-130471  [014] .....  5434.029558: defrag_one_cluster: r/i=5/257 start=0 len=262144
  btrfs-130471  [014] .....  5434.029559: defrag_one_cluster: r/i=5/257 start=67108864 len=16384

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/ioctl.c | 40 ++++++++++++++++++++++++++++++++++------
 1 file changed, 34 insertions(+), 6 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index fb991a8f3929..ec68be312ff7 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -1185,10 +1185,11 @@ struct defrag_target_range {
  */
 static int defrag_collect_targets(struct btrfs_inode *inode,
 				  const struct btrfs_defrag_ctrl *ctrl,
-				  u64 start, u32 len, bool locked,
-				  struct list_head *target_list)
+				  u64 start, u32 len, u64 *last_scanned_ret,
+				  bool locked, struct list_head *target_list)
 {
 	bool do_compress = ctrl->flags & BTRFS_DEFRAG_RANGE_COMPRESS;
+	bool last_is_target = false;
 	u64 cur = start;
 	int ret = 0;
 
@@ -1198,6 +1199,7 @@ static int defrag_collect_targets(struct btrfs_inode *inode,
 		bool next_mergeable = true;
 		u64 range_len;
 
+		last_is_target = false;
 		em = defrag_lookup_extent(&inode->vfs_inode, cur, locked);
 		if (!em)
 			break;
@@ -1269,6 +1271,7 @@ static int defrag_collect_targets(struct btrfs_inode *inode,
 		}
 
 add:
+		last_is_target = true;
 		range_len = min(extent_map_end(em), start + len) - cur;
 		/*
 		 * This one is a good target, check if it can be merged into
@@ -1312,6 +1315,17 @@ static int defrag_collect_targets(struct btrfs_inode *inode,
 			kfree(entry);
 		}
 	}
+	if (!ret && last_scanned_ret) {
+		/*
+		 * If the last extent is not a target, the caller can skip to
+		 * the end of that extent.
+		 * Otherwise, we can only go to the end of the specified range.
+		 */
+		if (!last_is_target)
+			*last_scanned_ret = cur;
+		else
+			*last_scanned_ret = start + len;
+	}
 	return ret;
 }
 
@@ -1382,6 +1396,7 @@ static int defrag_one_range(struct btrfs_inode *inode,
 	const u32 sectorsize = inode->root->fs_info->sectorsize;
 	u64 last_index = (start + len - 1) >> PAGE_SHIFT;
 	u64 start_index = start >> PAGE_SHIFT;
+	u64 last_scanned;
 	unsigned int nr_pages = last_index - start_index + 1;
 	int ret = 0;
 	int i;
@@ -1416,8 +1431,8 @@ static int defrag_one_range(struct btrfs_inode *inode,
 	 * And this time we have extent locked already, pass @locked = true
 	 * so that we won't relock the extent range and cause deadlock.
 	 */
-	ret = defrag_collect_targets(inode, ctrl, start, len, true,
-				     &target_list);
+	ret = defrag_collect_targets(inode, ctrl, start, len, &last_scanned,
+				     true, &target_list);
 	if (ret < 0)
 		goto unlock_extent;
 
@@ -1434,6 +1449,8 @@ static int defrag_one_range(struct btrfs_inode *inode,
 		list_del_init(&entry->list);
 		kfree(entry);
 	}
+	if (!ret)
+		ctrl->last_scanned = last_scanned;
 unlock_extent:
 	unlock_extent_cached(&inode->io_tree, start_index << PAGE_SHIFT,
 			     (last_index << PAGE_SHIFT) + PAGE_SIZE - 1,
@@ -1461,13 +1478,14 @@ static int defrag_one_cluster(struct btrfs_inode *inode,
 			      struct btrfs_defrag_ctrl *ctrl, u32 len)
 {
 	const u32 sectorsize = inode->root->fs_info->sectorsize;
+	const u64 orig_start = ctrl->last_scanned;
 	struct defrag_target_range *entry;
 	struct defrag_target_range *tmp;
 	LIST_HEAD(target_list);
 	int ret;
 
 	ret = defrag_collect_targets(inode, ctrl, ctrl->last_scanned, len,
-				     false, &target_list);
+				     NULL, false, &target_list);
 	if (ret < 0)
 		goto out;
 
@@ -1486,6 +1504,15 @@ static int defrag_one_cluster(struct btrfs_inode *inode,
 					  (ctrl->max_sectors_to_defrag -
 					   ctrl->sectors_defragged) * sectorsize);
 
+		/*
+		 * If defrag_one_range() has updated ctrl::last_scanned,
+		 * our range may already be invalid (e.g. hole punched).
+		 * Skip if our range is before ctrl::last_scanned, as there is
+		 * no need to defrag the range anymore.
+		 */
+		if (entry->start + range_len <= ctrl->last_scanned)
+			continue;
+
 		if (ra)
 			page_cache_sync_readahead(inode->vfs_inode.i_mapping,
 				ra, NULL, entry->start >> PAGE_SHIFT,
@@ -1500,6 +1527,8 @@ static int defrag_one_cluster(struct btrfs_inode *inode,
 		list_del_init(&entry->list);
 		kfree(entry);
 	}
+	if (ret >= 0)
+		ctrl->last_scanned = max(ctrl->last_scanned, orig_start + len);
 	return ret;
 }
 
@@ -1645,7 +1674,6 @@ int btrfs_defrag_file(struct inode *inode, struct file_ra_state *ra,
 		btrfs_inode_unlock(inode, 0);
 		if (ret < 0)
 			break;
-		ctrl->last_scanned = cluster_end + 1;
 		if (ret > 0) {
 			ret = 0;
 			break;
-- 
2.35.0



* Re: [PATCH v3 0/5] btrfs: defrag: don't waste CPU time on non-target extent
  2022-02-05  5:41 [PATCH v3 0/5] btrfs: defrag: don't waste CPU time on non-target extent Qu Wenruo
                   ` (4 preceding siblings ...)
  2022-02-05  5:41 ` [PATCH v3 5/5] btrfs: defrag: allow defrag_one_cluster() to skip large extent which is not a target Qu Wenruo
@ 2022-02-08 22:09 ` David Sterba
  2022-02-09  0:17   ` Qu Wenruo
  5 siblings, 1 reply; 13+ messages in thread
From: David Sterba @ 2022-02-08 22:09 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

On Sat, Feb 05, 2022 at 01:41:01PM +0800, Qu Wenruo wrote:
> In the rework of btrfs_defrag_file() one core idea is to defrag cluster
> by cluster, thus we can have a better layered code structure, just like
> what we have now:
> 
> btrfs_defrag_file()
> |- defrag_one_cluster()
>    |- defrag_one_range()
>       |- defrag_one_locked_range()
> 
> But there is a catch: btrfs_defrag_file() just moves the cluster to the
> next cluster, never considering cases where the current extent is already
> too large and we can skip to its end directly.
>
> This increases CPU usage on very large but unfragmented files.
>
> Fix the behavior in defrag_one_cluster() so that defrag_collect_targets()
> will report where the next search should start from.
> 
> If the current extent is not a target at all, then we can jump to the
> end of that non-target extent to save time.
> 
> To get the missing optimization, also introduce a new structure,
> btrfs_defrag_ctrl, so we don't need to pass things like @newer_than and
> @max_to_defrag around.

Is this patchset supposed to be in 5.16 as a fix for some defrag problem?
If yes, the patch switching to the control structure should be avoided
and done as a post-cleanup, as some other patches depend on it.


* Re: [PATCH v3 0/5] btrfs: defrag: don't waste CPU time on non-target extent
  2022-02-08 22:09 ` [PATCH v3 0/5] btrfs: defrag: don't waste CPU time on non-target extent David Sterba
@ 2022-02-09  0:17   ` Qu Wenruo
  2022-02-09 15:19     ` David Sterba
  0 siblings, 1 reply; 13+ messages in thread
From: Qu Wenruo @ 2022-02-09  0:17 UTC (permalink / raw)
  To: dsterba, Qu Wenruo, linux-btrfs



On 2022/2/9 06:09, David Sterba wrote:
> On Sat, Feb 05, 2022 at 01:41:01PM +0800, Qu Wenruo wrote:
>> In the rework of btrfs_defrag_file() one core idea is to defrag cluster
>> by cluster, thus we can have a better layered code structure, just like
>> what we have now:
>>
>> btrfs_defrag_file()
>> |- defrag_one_cluster()
>>     |- defrag_one_range()
>>        |- defrag_one_locked_range()
>>
>> But there is a catch: btrfs_defrag_file() just moves the cluster to the
>> next cluster, never considering cases where the current extent is already
>> too large and we can skip to its end directly.
>>
>> This increases CPU usage on very large but unfragmented files.
>>
>> Fix the behavior in defrag_one_cluster() so that defrag_collect_targets()
>> will report where the next search should start from.
>>
>> If the current extent is not a target at all, then we can jump to the
>> end of that non-target extent to save time.
>>
>> To get the missing optimization, also introduce a new structure,
>> btrfs_defrag_ctrl, so we don't need to pass things like @newer_than and
>> @max_to_defrag around.
>
> Is this patchset supposed to be in 5.16 as a fix for some defrag problem?
> If yes, the patch switching to the control structure should be avoided
> and done as a post-cleanup, as some other patches depend on it.

I can backport it manually to v5.16 without the ctrl refactor.

The problem is that the current situation is not really ideal, thus I
put the refactor first.

Thanks,
Qu


* Re: [PATCH v3 0/5] btrfs: defrag: don't waste CPU time on non-target extent
  2022-02-09  0:17   ` Qu Wenruo
@ 2022-02-09 15:19     ` David Sterba
  2022-02-10  0:33       ` Qu Wenruo
  0 siblings, 1 reply; 13+ messages in thread
From: David Sterba @ 2022-02-09 15:19 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: dsterba, Qu Wenruo, linux-btrfs

On Wed, Feb 09, 2022 at 08:17:27AM +0800, Qu Wenruo wrote:
> 
> 
> On 2022/2/9 06:09, David Sterba wrote:
> > On Sat, Feb 05, 2022 at 01:41:01PM +0800, Qu Wenruo wrote:
> >> In the rework of btrfs_defrag_file() one core idea is to defrag cluster
> >> by cluster, thus we can have a better layered code structure, just like
> >> what we have now:
> >>
> >> btrfs_defrag_file()
> >> |- defrag_one_cluster()
> >>     |- defrag_one_range()
> >>        |- defrag_one_locked_range()
> >>
> >> But there is a catch: btrfs_defrag_file() just moves the cluster to the
> >> next cluster, never considering cases where the current extent is already
> >> too large and we can skip to its end directly.
> >>
> >> This increases CPU usage on very large but unfragmented files.
> >>
> >> Fix the behavior in defrag_one_cluster() so that defrag_collect_targets()
> >> will report where the next search should start from.
> >>
> >> If the current extent is not a target at all, then we can jump to the
> >> end of that non-target extent to save time.
> >>
> >> To get the missing optimization, also introduce a new structure,
> >> btrfs_defrag_ctrl, so we don't need to pass things like @newer_than and
> >> @max_to_defrag around.
> >
> >>> Is this patchset supposed to be in 5.16 as a fix for some defrag problem?
> >>> If yes, the patch switching to the control structure should be avoided
> >>> and done as a post-cleanup, as some other patches depend on it.
> 
> I can backport it manually to v5.16 without the ctrl refactor.

That's not great if we have to do two versions, and more fixes to defrag
are still expected, so a cleanup patch makes any backporting harder. The
option to send a manually ported version to stable is there for cases
when the changes cannot be applied directly. In this case the cleanup is
not necessary, but I understand it makes the code cleaner. Given that we
need to fix a released kernel, the minimal version is the preference.


* Re: [PATCH v3 0/5] btrfs: defrag: don't waste CPU time on non-target extent
  2022-02-09 15:19     ` David Sterba
@ 2022-02-10  0:33       ` Qu Wenruo
  2022-02-10 14:26         ` David Sterba
  0 siblings, 1 reply; 13+ messages in thread
From: Qu Wenruo @ 2022-02-10  0:33 UTC (permalink / raw)
  To: dsterba, Qu Wenruo, linux-btrfs



On 2022/2/9 23:19, David Sterba wrote:
> On Wed, Feb 09, 2022 at 08:17:27AM +0800, Qu Wenruo wrote:
>>
>>
>> On 2022/2/9 06:09, David Sterba wrote:
>>> On Sat, Feb 05, 2022 at 01:41:01PM +0800, Qu Wenruo wrote:
>>>> In the rework of btrfs_defrag_file() one core idea is to defrag cluster
>>>> by cluster, thus we can have a better layered code structure, just like
>>>> what we have now:
>>>>
>>>> btrfs_defrag_file()
>>>> |- defrag_one_cluster()
>>>>      |- defrag_one_range()
>>>>         |- defrag_one_locked_range()
>>>>
>>>> But there is a catch: btrfs_defrag_file() just moves the cluster to the
>>>> next cluster, never considering cases where the current extent is already
>>>> too large and we can skip to its end directly.
>>>>
>>>> This increases CPU usage on very large but unfragmented files.
>>>>
>>>> Fix the behavior in defrag_one_cluster() so that defrag_collect_targets()
>>>> will report where the next search should start from.
>>>>
>>>> If the current extent is not a target at all, then we can jump to the
>>>> end of that non-target extent to save time.
>>>>
>>>> To get the missing optimization, also introduce a new structure,
>>>> btrfs_defrag_ctrl, so we don't need to pass things like @newer_than and
>>>> @max_to_defrag around.
>>>
>>> Is this patchset supposed to be in 5.16 as a fix for some defrag problem?
>>> If yes, the patch switching to the control structure should be avoided
>>> and done as a post-cleanup, as some other patches depend on it.
>>
>> I can backport it manually to v5.16 without the ctrl refactor.
> 
> That's not great if we have to do two versions, and more fixes to defrag
> are still expected, so a cleanup patch makes any backporting harder. The
> option to send a manually ported version to stable is there for cases
> when the changes cannot be applied directly. In this case the cleanup is
> not necessary, but I understand it makes the code cleaner. Given that we
> need to fix a released kernel, the minimal version is the preference.
> 
Unfortunately, it looks like without this cleanup, all later patches
either need extra parameters passed down the call chain, or have very
"creative" ways to pass info around, which can cause more problems in
the long run.

I'm wondering if it's some policy in the stable tree preventing such
cleanup patches from being merged, or something else?

Thanks,
Qu



* Re: [PATCH v3 0/5] btrfs: defrag: don't waste CPU time on non-target extent
  2022-02-10  0:33       ` Qu Wenruo
@ 2022-02-10 14:26         ` David Sterba
  2022-02-10 16:52           ` David Sterba
  0 siblings, 1 reply; 13+ messages in thread
From: David Sterba @ 2022-02-10 14:26 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: dsterba, Qu Wenruo, linux-btrfs

On Thu, Feb 10, 2022 at 08:33:04AM +0800, Qu Wenruo wrote:
> On 2022/2/9 23:19, David Sterba wrote:
> > On Wed, Feb 09, 2022 at 08:17:27AM +0800, Qu Wenruo wrote:
> >> On 2022/2/9 06:09, David Sterba wrote:
> >>> On Sat, Feb 05, 2022 at 01:41:01PM +0800, Qu Wenruo wrote:
> >>>> To get the missing optimization, also introduce a new structure,
> >>>> btrfs_defrag_ctrl, so we don't need to pass things like @newer_than and
> >>>> @max_to_defrag around.
> >>>
> >>> Is this patchset supposed to be in 5.16 as a fix for some defrag problem?
> >>> If yes, the patch switching to the control structure should be avoided
> >>> and done as a post-cleanup, as some other patches depend on it.
> >>
> >> I can backport it manually to v5.16 without the ctrl refactor.
> > 
> > That's not great if we have to do two versions, and more fixes to defrag
> > are still expected, so a cleanup patch makes any backporting harder. The
> > option to send a manually ported version to stable is there for cases
> > when the changes cannot be applied directly. In this case the cleanup is
> > not necessary, but I understand it makes the code cleaner. Given that we
> > need to fix a released kernel, the minimal version is the preference.
> > 
> Unfortunately, it looks like without this cleanup, all later patches
> either need extra parameters passed down the call chain, or have very
> "creative" ways to pass info around, which can cause more problems in
> the long run.
>
> I'm wondering if it's some policy in the stable tree preventing such
> cleanup patches from being merged, or something else?

It's not a hard policy, I don't remember a patch rejected from stable
because it needed prep work. The fix itself must be justified, and
minimal if possible, and that's what I'm always trying to go for.

This patch series has the diffstat

4 files changed, 163 insertions(+), 101 deletions(-)

and the preparatory patch itself

3 files changed, 71 insertions(+), 93 deletions(-)

Patch 4 probably needs to add one parameter, and patch 5 is adding a new
one even though there's the ctrl structure. So I think it would not be
that intrusive to pass a few extra parameters compared to the whole ctrl
conversion.


* Re: [PATCH v3 0/5] btrfs: defrag: don't waste CPU time on non-target extent
  2022-02-10 14:26         ` David Sterba
@ 2022-02-10 16:52           ` David Sterba
  2022-02-11  0:42             ` Qu Wenruo
  0 siblings, 1 reply; 13+ messages in thread
From: David Sterba @ 2022-02-10 16:52 UTC (permalink / raw)
  To: dsterba, Qu Wenruo, Qu Wenruo, linux-btrfs

On Thu, Feb 10, 2022 at 03:26:52PM +0100, David Sterba wrote:
> > Unfortunately, it looks like without this cleanup, all later patches
> > either need extra parameters passed down the call chain, or have very
> > "creative" ways to pass info around, which can cause more problems in
> > the long run.
> >
> > I'm wondering if it's some policy in the stable tree preventing such
> > cleanup patches from being merged, or something else?
> 
> It's not a hard policy, I don't remember a patch rejected from stable
> because it needed prep work. The fix itself must be justified, and
> minimal if possible, and that's what I'm always trying to go for.
>
> This patch series has the diffstat
>
> 4 files changed, 163 insertions(+), 101 deletions(-)
>
> and the preparatory patch itself
>
> 3 files changed, 71 insertions(+), 93 deletions(-)
>
> Patch 4 probably needs to add one parameter, and patch 5 is adding a new
> one even though there's the ctrl structure. So I think it would not be
> that intrusive to pass a few extra parameters compared to the whole ctrl
> conversion.

I've tried to minimize that, basically it's just patch 5, passing around
the last_scanned, but I'm not 100% sure; the ctrl structure makes it
incomprehensible what else needs to be passed around or initialized.
If we're fixing a not-so-great rewrite I'm very inclined to fix it by a
series of minimal fixes, not another rewrite.


* Re: [PATCH v3 0/5] btrfs: defrag: don't waste CPU time on non-target extent
  2022-02-10 16:52           ` David Sterba
@ 2022-02-11  0:42             ` Qu Wenruo
  0 siblings, 0 replies; 13+ messages in thread
From: Qu Wenruo @ 2022-02-11  0:42 UTC (permalink / raw)
  To: dsterba, Qu Wenruo, linux-btrfs



On 2022/2/11 00:52, David Sterba wrote:
> On Thu, Feb 10, 2022 at 03:26:52PM +0100, David Sterba wrote:
>>> Unfortunately, it looks like without this cleanup, all later patches
>>> either needs extra parameters passing down the call chain, or have very
>>> "creative" way to pass info around, and can causing more problem in the
>>> long run.
>>>
>>> I'm wondering if it's some policy in the stable tree preventing such
>>> cleanup patches being merged or something else?
>>
>> It's not a hard policy, I don't remember a patch rejected from stable
>> because it needed a prep work. The fix itself must be justified, and
>> minimal if possible and that's what I'm always trying to go for.
>>
>> This patches has diffstat
>>
>> 4 files changed, 163 insertions(+), 101 deletions(-)
>>
>> and the preparatory patch itself
>>
>> 3 files changed, 71 insertions(+), 93 deletions(-)
>>
>> Patch 4 probably needs to add one parameter, and 5 is adding a new
>> one even though there's the ctrl structure. So I think it would not be
>> that intrusive with a few extra parameters compared to the whole ctrl
>> conversion.
>
> I've tried to minimize that, basically it's just patch 5, passing around
> the last_scanned, but I'm not 100% sure, the ctrl structure makes it
> incomprehensible what else needs to be passed around or initialized.
> If we're fixing a not so great rewrite I'm very inclined to fix it by a
> series of minimal fixes, not another rewrite.

OK, I'll re-order the patchset to put the fix in front of the structure
rework.

Thanks,
Qu

