* [PATCH RESEND 1/2] btrfs: Enhance btrfs_trim_fs function to handle error better
@ 2018-04-04  6:15 Qu Wenruo
  2018-04-04  6:15 ` [PATCH RESEND 2/2] btrfs: Ensure btrfs_trim_fs can trim the whole fs Qu Wenruo
From: Qu Wenruo @ 2018-04-04  6:15 UTC
  To: linux-btrfs

btrfs_trim_fs() doesn't handle errors consistently. If an error happens
while trimming existing block groups, it skips the remaining block groups
and goes on to trim unallocated space for each device.

The return value then only reflects the final error from device
trimming.

This patch fixes that behavior by:

1) Recording the first error from block group or device trimming
   The return value now reflects any error found while trimming,
   making developers more aware of the problem.

2) Outputting a btrfs warning message for each trimming failure
   Any error in block group or device trimming now produces a btrfs
   warning in the kernel log.

3) Continuing trimming if we can
   If we fail to trim one block group or device, we can still try the
   next block group or device.

This avoids confusing cases such as a failure to trim the first block
group, after which only unallocated space gets trimmed.

Reported-by: Chris Murphy <lists@colorremedies.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
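For readers who want the error-handling pattern in isolation, here is a
minimal userspace sketch of "warn on every failure, keep going, return the
first error". It is not the kernel code: struct trim_result and
trim_all_sketch() are hypothetical names made up for illustration, and
fprintf() stands in for btrfs_warn_rl().

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical per-item outcome: errno-style code plus bytes trimmed. */
struct trim_result {
	int err;			/* 0 on success, negative on failure */
	unsigned long long bytes;	/* bytes trimmed for this item */
};

/*
 * Walk every item, warn about each failure without stopping, and return
 * the first error seen (or 0 if every item succeeded).
 */
static int trim_all_sketch(const struct trim_result *items, size_t nr,
			   unsigned long long *trimmed)
{
	int first_err = 0;
	size_t i;

	*trimmed = 0;
	for (i = 0; i < nr; i++) {
		/* Count whatever was trimmed, even if the item failed. */
		*trimmed += items[i].bytes;
		if (items[i].err) {
			fprintf(stderr, "failed to trim item %zu ret %d\n",
				i, items[i].err);
			/* Later errors never overwrite the first one. */
			if (!first_err)
				first_err = items[i].err;
		}
	}
	return first_err;
}
```

With results { 0, -5, -28, 0 } the sketch visits all four items and
returns -5, which is the behavior the patch aims for in the block group
loop.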
 fs/btrfs/extent-tree.c | 59 ++++++++++++++++++++++++++++++++++++--------------
 1 file changed, 43 insertions(+), 16 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index d83d449e749a..f3b088665b7a 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -10975,6 +10975,16 @@ static int btrfs_trim_free_extents(struct btrfs_device *device,
 	return ret;
 }
 
+/*
+ * Trim the whole fs, by:
+ * 1) Trimming free space in each block group
+ * 2) Trimming unallocated space in each device
+ *
+ * Will try to continue trimming even if we failed to trim one block group or
+ * device.
+ * The return value will be the error return value of the first error.
+ * Or 0 if nothing wrong happened.
+ */
 int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
 {
 	struct btrfs_block_group_cache *cache = NULL;
@@ -10985,6 +10995,8 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
 	u64 end;
 	u64 trimmed = 0;
 	u64 total_bytes = btrfs_super_total_bytes(fs_info->super_copy);
+	int bg_ret = 0;
+	int dev_ret = 0;
 	int ret = 0;
 
 	/*
@@ -10995,7 +11007,7 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
 	else
 		cache = btrfs_lookup_block_group(fs_info, range->start);
 
-	while (cache) {
+	for (; cache; cache = next_block_group(fs_info, cache)) {
 		if (cache->key.objectid >= (range->start + range->len)) {
 			btrfs_put_block_group(cache);
 			break;
@@ -11009,29 +11021,36 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
 			if (!block_group_cache_done(cache)) {
 				ret = cache_block_group(cache, 0);
 				if (ret) {
-					btrfs_put_block_group(cache);
-					break;
+					btrfs_warn_rl(fs_info,
+		"failed to cache block group %llu ret %d",
+						   cache->key.objectid, ret);
+					if (!bg_ret)
+						bg_ret = ret;
+					continue;
 				}
 				ret = wait_block_group_cache_done(cache);
 				if (ret) {
-					btrfs_put_block_group(cache);
-					break;
+					btrfs_warn_rl(fs_info,
+		"failed to wait cache for block group %llu ret %d",
+						   cache->key.objectid, ret);
+					if (!bg_ret)
+						bg_ret = ret;
+					continue;
 				}
 			}
-			ret = btrfs_trim_block_group(cache,
-						     &group_trimmed,
-						     start,
-						     end,
-						     range->minlen);
+			ret = btrfs_trim_block_group(cache, &group_trimmed,
+						start, end, range->minlen);
 
 			trimmed += group_trimmed;
 			if (ret) {
-				btrfs_put_block_group(cache);
-				break;
+				btrfs_warn_rl(fs_info,
+		"failed to trim block group %llu ret %d",
+					   cache->key.objectid, ret);
+				if (!bg_ret)
+					bg_ret = ret;
+				continue;
 			}
 		}
-
-		cache = next_block_group(fs_info, cache);
 	}
 
 	mutex_lock(&fs_info->fs_devices->device_list_mutex);
@@ -11039,15 +11058,23 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
 	list_for_each_entry(device, devices, dev_alloc_list) {
 		ret = btrfs_trim_free_extents(device, range->minlen,
 					      &group_trimmed);
-		if (ret)
+		if (ret) {
+			btrfs_warn_rl(fs_info,
+		"failed to trim unallocated space for devid %llu ret %d",
+				      device->devid, ret);
+			if (!dev_ret)
+				dev_ret = ret;
 			break;
+		}
 
 		trimmed += group_trimmed;
 	}
 	mutex_unlock(&fs_info->fs_devices->device_list_mutex);
 
 	range->len = trimmed;
-	return ret;
+	if (bg_ret)
+		return bg_ret;
+	return dev_ret;
 }
 
 /*
-- 
2.16.3



* [PATCH RESEND 2/2] btrfs: Ensure btrfs_trim_fs can trim the whole fs
  2018-04-04  6:15 [PATCH RESEND 1/2] btrfs: Enhance btrfs_trim_fs function to handle error better Qu Wenruo
@ 2018-04-04  6:15 ` Qu Wenruo
From: Qu Wenruo @ 2018-04-04  6:15 UTC
  To: linux-btrfs; +Cc: stable

[BUG]
fstrim on some btrfs filesystems only trims the unallocated space and
does not trim any space in existing block groups.

[CAUSE]
Before the fstrim_range is passed to btrfs_trim_fs(), it gets truncated
to the range [0, super->total_bytes).
So btrfs_trim_fs() can later only trim block groups in the range
[0, super->total_bytes).

For btrfs, however, any bytenr aligned to the sector size is valid. Since
btrfs uses its own logical address space, nothing limits where we can put
block groups.

On a btrfs with routine balance, it's quite easy to relocate all block
groups so that their bytenr starts beyond super->total_bytes.

In that case, btrfs will not trim existing block groups.

[FIX]
Just remove the truncation in btrfs_ioctl_fitrim(), so btrfs_trim_fs()
gets the unmodified range, which is normally set to [0, U64_MAX].

Reported-by: Chris Murphy <lists@colorremedies.com>
Fixes: f4c697e6406d ("btrfs: return EINVAL if start > total_bytes in fitrim ioctl")
Cc: <stable@vger.kernel.org> # v4.0+
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
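To make the cause concrete, here is a small userspace sketch of the range
arithmetic. It is a simplification, not the kernel code: bg_in_range() and
truncate_len() are hypothetical helpers, only the start bytenr of a block
group is checked, and the 10GiB/12GiB figures in the note below are
made-up example values.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: does a block group starting at bg_start fall
 * inside [range_start, range_start + range_len)?
 * The subtraction form avoids overflowing range_start + range_len.
 */
static bool bg_in_range(uint64_t bg_start, uint64_t range_start,
			uint64_t range_len)
{
	return bg_start >= range_start &&
	       bg_start - range_start < range_len;
}

/*
 * The old ioctl behavior: clamp len so the range ends at total_bytes.
 * The caller must ensure start <= total_bytes.
 */
static uint64_t truncate_len(uint64_t start, uint64_t len,
			     uint64_t total_bytes)
{
	uint64_t max_len = total_bytes - start;

	return len < max_len ? len : max_len;
}
```

With total_bytes = 10GiB and a block group relocated to logical bytenr
12GiB, a request of [0, U64_MAX] truncated by truncate_len() no longer
covers the block group, while the untruncated request does.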
 fs/btrfs/extent-tree.c | 10 +---------
 fs/btrfs/ioctl.c       | 11 +++++++----
 2 files changed, 8 insertions(+), 13 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index f3b088665b7a..647691fc16e8 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -10994,19 +10994,11 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
 	u64 start;
 	u64 end;
 	u64 trimmed = 0;
-	u64 total_bytes = btrfs_super_total_bytes(fs_info->super_copy);
 	int bg_ret = 0;
 	int dev_ret = 0;
 	int ret = 0;
 
-	/*
-	 * try to trim all FS space, our block group may start from non-zero.
-	 */
-	if (range->len == total_bytes)
-		cache = btrfs_lookup_first_block_group(fs_info, range->start);
-	else
-		cache = btrfs_lookup_block_group(fs_info, range->start);
-
+	cache = btrfs_lookup_first_block_group(fs_info, range->start);
 	for (; cache; cache = next_block_group(fs_info, cache)) {
 		if (cache->key.objectid >= (range->start + range->len)) {
 			btrfs_put_block_group(cache);
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index ac85e07f567b..761fba8d8f75 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -364,7 +364,6 @@ static noinline int btrfs_ioctl_fitrim(struct file *file, void __user *arg)
 	struct fstrim_range range;
 	u64 minlen = ULLONG_MAX;
 	u64 num_devices = 0;
-	u64 total_bytes = btrfs_super_total_bytes(fs_info->super_copy);
 	int ret;
 
 	if (!capable(CAP_SYS_ADMIN))
@@ -388,11 +387,15 @@ static noinline int btrfs_ioctl_fitrim(struct file *file, void __user *arg)
 		return -EOPNOTSUPP;
 	if (copy_from_user(&range, arg, sizeof(range)))
 		return -EFAULT;
-	if (range.start > total_bytes ||
-	    range.len < fs_info->sb->s_blocksize)
+
+	/*
+	 * NOTE: Don't truncate the range using super->total_bytes.
+	 * Bytenr of btrfs block group is in btrfs logical address space,
+	 * which can be any sector size aligned bytenr in [0, U64_MAX].
+	 */
+	if (range.len < fs_info->sb->s_blocksize)
 		return -EINVAL;
 
-	range.len = min(range.len, total_bytes - range.start);
 	range.minlen = max(range.minlen, minlen);
 	ret = btrfs_trim_fs(fs_info, &range);
 	if (ret < 0)
-- 
2.16.3



* Re: [PATCH RESEND 1/2] btrfs: Enhance btrfs_trim_fs function to handle error better
  2017-11-28  7:08 [PATCH RESEND 1/2] btrfs: Enhance btrfs_trim_fs function to handle error better Qu Wenruo
@ 2018-04-01 10:35 ` Qu Wenruo
From: Qu Wenruo @ 2018-04-01 10:35 UTC
  To: Qu Wenruo, linux-btrfs; +Cc: dsterba


Gentle ping?

The patch is small and (with its 2nd patch) should fix trim behavior
inside block groups.

Thanks,
Qu




* [PATCH RESEND 1/2] btrfs: Enhance btrfs_trim_fs function to handle error better
@ 2017-11-28  7:08 Qu Wenruo
  2018-04-01 10:35 ` Qu Wenruo
From: Qu Wenruo @ 2017-11-28  7:08 UTC
  To: linux-btrfs; +Cc: dsterba

btrfs_trim_fs() doesn't handle errors consistently. If an error happens
while trimming existing block groups, it skips the remaining block groups
and goes on to trim unallocated space for each device.

The return value then only reflects the final error from device
trimming.

This patch fixes that behavior by:

1) Recording the first error from block group or device trimming
   The return value now reflects any error found while trimming,
   making developers more aware of the problem.

2) Outputting a btrfs warning message for each trimming failure
   Any error in block group or device trimming now produces a btrfs
   warning in the kernel log.

3) Continuing trimming if we can
   If we fail to trim one block group or device, we can still try the
   next block group or device.

This avoids confusing cases such as a failure to trim the first block
group, after which only unallocated space gets trimmed.

Reported-by: Chris Murphy <lists@colorremedies.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/extent-tree.c | 59 ++++++++++++++++++++++++++++++++++++--------------
 1 file changed, 43 insertions(+), 16 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 673ac4e01dd0..f830aa91ac3d 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -10948,6 +10948,16 @@ static int btrfs_trim_free_extents(struct btrfs_device *device,
 	return ret;
 }
 
+/*
+ * Trim the whole fs, by:
+ * 1) Trimming free space in each block group
+ * 2) Trimming unallocated space in each device
+ *
+ * Will try to continue trimming even if we failed to trim one block group or
+ * device.
+ * The return value will be the error return value of the first error.
+ * Or 0 if nothing wrong happened.
+ */
 int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
 {
 	struct btrfs_block_group_cache *cache = NULL;
@@ -10958,6 +10968,8 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
 	u64 end;
 	u64 trimmed = 0;
 	u64 total_bytes = btrfs_super_total_bytes(fs_info->super_copy);
+	int bg_ret = 0;
+	int dev_ret = 0;
 	int ret = 0;
 
 	/*
@@ -10968,7 +10980,7 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
 	else
 		cache = btrfs_lookup_block_group(fs_info, range->start);
 
-	while (cache) {
+	for (; cache; cache = next_block_group(fs_info, cache)) {
 		if (cache->key.objectid >= (range->start + range->len)) {
 			btrfs_put_block_group(cache);
 			break;
@@ -10982,29 +10994,36 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
 			if (!block_group_cache_done(cache)) {
 				ret = cache_block_group(cache, 0);
 				if (ret) {
-					btrfs_put_block_group(cache);
-					break;
+					btrfs_warn_rl(fs_info,
+		"failed to cache block group %llu ret %d",
+						   cache->key.objectid, ret);
+					if (!bg_ret)
+						bg_ret = ret;
+					continue;
 				}
 				ret = wait_block_group_cache_done(cache);
 				if (ret) {
-					btrfs_put_block_group(cache);
-					break;
+					btrfs_warn_rl(fs_info,
+		"failed to wait cache for block group %llu ret %d",
+						   cache->key.objectid, ret);
+					if (!bg_ret)
+						bg_ret = ret;
+					continue;
 				}
 			}
-			ret = btrfs_trim_block_group(cache,
-						     &group_trimmed,
-						     start,
-						     end,
-						     range->minlen);
+			ret = btrfs_trim_block_group(cache, &group_trimmed,
+						start, end, range->minlen);
 
 			trimmed += group_trimmed;
 			if (ret) {
-				btrfs_put_block_group(cache);
-				break;
+				btrfs_warn_rl(fs_info,
+		"failed to trim block group %llu ret %d",
+					   cache->key.objectid, ret);
+				if (!bg_ret)
+					bg_ret = ret;
+				continue;
 			}
 		}
-
-		cache = next_block_group(fs_info, cache);
 	}
 
 	mutex_lock(&fs_info->fs_devices->device_list_mutex);
@@ -11012,15 +11031,23 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
 	list_for_each_entry(device, devices, dev_alloc_list) {
 		ret = btrfs_trim_free_extents(device, range->minlen,
 					      &group_trimmed);
-		if (ret)
+		if (ret) {
+			btrfs_warn_rl(fs_info,
+		"failed to trim unallocated space for devid %llu ret %d",
+				      device->devid, ret);
+			if (!dev_ret)
+				dev_ret = ret;
 			break;
+		}
 
 		trimmed += group_trimmed;
 	}
 	mutex_unlock(&fs_info->fs_devices->device_list_mutex);
 
 	range->len = trimmed;
-	return ret;
+	if (bg_ret)
+		return bg_ret;
+	return dev_ret;
 }
 
 /*
-- 
2.15.0

