* [PATCH 0/2] btrfs_del_csums error handling fixes
@ 2021-05-19 14:52 Josef Bacik
  2021-05-19 14:52 ` [PATCH 1/2] btrfs: fix error handling in btrfs_del_csums Josef Bacik
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Josef Bacik @ 2021-05-19 14:52 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

Hello,

Here are two fixes related to deleting csums.  Doing error injection stress
testing I was consistently seeing cases where we had a corrupt file system with
csums that existed without the corresponding extents being written.  This was
occurring because we were losing the return value in two cases, both of which
would result in this style of corruption.  With these two patches I'm no longer
seeing these errors.  Thanks,

Josef

Josef Bacik (2):
  btrfs: fix error handling in btrfs_del_csums
  btrfs: return errors from btrfs_del_csums in cleanup_ref_head

 fs/btrfs/extent-tree.c |  4 ++--
 fs/btrfs/file-item.c   | 10 +++++-----
 2 files changed, 7 insertions(+), 7 deletions(-)

-- 
2.26.3



* [PATCH 1/2] btrfs: fix error handling in btrfs_del_csums
  2021-05-19 14:52 [PATCH 0/2] btrfs_del_csums error handling fixes Josef Bacik
@ 2021-05-19 14:52 ` Josef Bacik
  2021-05-20  1:02   ` Qu Wenruo
  2021-05-19 14:52 ` [PATCH 2/2] btrfs: return errors from btrfs_del_csums in cleanup_ref_head Josef Bacik
  2021-05-21 12:24 ` [PATCH 0/2] btrfs_del_csums error handling fixes David Sterba
  2 siblings, 1 reply; 7+ messages in thread
From: Josef Bacik @ 2021-05-19 14:52 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

Error injection stress testing would sometimes fail with checksums on disk that
did not have a corresponding extent.  This occurred because the pattern
in btrfs_del_csums was

	while (1) {
		ret = btrfs_search_slot();
		if (ret < 0)
			break;
	}
	ret = 0;
out:
	btrfs_free_path(path);
	return ret;

If we got an error from btrfs_search_slot we'd clear the error because
we were breaking instead of goto out.  Instead of using goto out, simply
handle the cases where we may leave a random value in ret, and get rid
of the

	ret = 0;
out:

pattern and simply allow break to have the proper error reporting.  With
this fix we properly abort the transaction and do not commit thinking we
successfully deleted the csum.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/file-item.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/file-item.c b/fs/btrfs/file-item.c
index 294602f139ef..a5a8dac334e8 100644
--- a/fs/btrfs/file-item.c
+++ b/fs/btrfs/file-item.c
@@ -788,7 +788,7 @@ int btrfs_del_csums(struct btrfs_trans_handle *trans,
 	u64 end_byte = bytenr + len;
 	u64 csum_end;
 	struct extent_buffer *leaf;
-	int ret;
+	int ret = 0;
 	const u32 csum_size = fs_info->csum_size;
 	u32 blocksize_bits = fs_info->sectorsize_bits;
 
@@ -806,6 +806,7 @@ int btrfs_del_csums(struct btrfs_trans_handle *trans,
 
 		ret = btrfs_search_slot(trans, root, &key, path, -1, 1);
 		if (ret > 0) {
+			ret = 0;
 			if (path->slots[0] == 0)
 				break;
 			path->slots[0]--;
@@ -862,7 +863,7 @@ int btrfs_del_csums(struct btrfs_trans_handle *trans,
 			ret = btrfs_del_items(trans, root, path,
 					      path->slots[0], del_nr);
 			if (ret)
-				goto out;
+				break;
 			if (key.offset == bytenr)
 				break;
 		} else if (key.offset < bytenr && csum_end > end_byte) {
@@ -906,8 +907,9 @@ int btrfs_del_csums(struct btrfs_trans_handle *trans,
 			ret = btrfs_split_item(trans, root, path, &key, offset);
 			if (ret && ret != -EAGAIN) {
 				btrfs_abort_transaction(trans, ret);
-				goto out;
+				break;
 			}
+			ret = 0;
 
 			key.offset = end_byte - 1;
 		} else {
@@ -917,8 +919,6 @@ int btrfs_del_csums(struct btrfs_trans_handle *trans,
 		}
 		btrfs_release_path(path);
 	}
-	ret = 0;
-out:
 	btrfs_free_path(path);
 	return ret;
 }
-- 
2.26.3
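
For readers following the thread, the control-flow problem described in the
commit message can be reproduced outside the kernel.  The following is only a
minimal userspace sketch, not the btrfs code: fake_search(), del_range_old()
and del_range_new() are hypothetical names, and fake_search() merely imitates
the btrfs_search_slot() convention of returning a negative errno on error, 0
on an exact match and a positive value when the key was not found.

```c
#include <errno.h>

/*
 * Hypothetical stand-in for btrfs_search_slot(): returns -EIO when the
 * step number equals fail_at, otherwise alternates between 0 (found)
 * and 1 (not found).
 */
static int fake_search(int step, int fail_at)
{
	if (step == fail_at)
		return -EIO;
	return step % 2;
}

/* Old shape: break leaves the loop, then "ret = 0" overwrites the error. */
static int del_range_old(int steps, int fail_at)
{
	int ret;
	int i;

	for (i = 0; i < steps; i++) {
		ret = fake_search(i, fail_at);
		if (ret < 0)
			break;
	}
	ret = 0;
	return ret;
}

/*
 * New shape: ret starts at 0, the positive "not found" value is
 * normalized inside the loop, and break carries the error out.
 */
static int del_range_new(int steps, int fail_at)
{
	int ret = 0;
	int i;

	for (i = 0; i < steps; i++) {
		ret = fake_search(i, fail_at);
		if (ret < 0)
			break;
		if (ret > 0)
			ret = 0;	/* "not found" is not an error */
	}
	return ret;
}
```

With a failure injected at step 2, del_range_old(4, 2) reports success while
del_range_new(4, 2) returns -EIO.  A caller can only abort on the error in the
second case.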



* [PATCH 2/2] btrfs: return errors from btrfs_del_csums in cleanup_ref_head
  2021-05-19 14:52 [PATCH 0/2] btrfs_del_csums error handling fixes Josef Bacik
  2021-05-19 14:52 ` [PATCH 1/2] btrfs: fix error handling in btrfs_del_csums Josef Bacik
@ 2021-05-19 14:52 ` Josef Bacik
  2021-05-20  1:02   ` Qu Wenruo
  2021-05-21 12:20   ` David Sterba
  2021-05-21 12:24 ` [PATCH 0/2] btrfs_del_csums error handling fixes David Sterba
  2 siblings, 2 replies; 7+ messages in thread
From: Josef Bacik @ 2021-05-19 14:52 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

We are unconditionally returning 0 in cleanup_ref_head, despite the fact
that btrfs_del_csums could fail.  We need to return the error so the
transaction gets aborted properly.  Fix this by returning ret from
btrfs_del_csums in cleanup_ref_head.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/extent-tree.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index b84bbc24ff57..790de24576ac 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -1826,7 +1826,7 @@ static int cleanup_ref_head(struct btrfs_trans_handle *trans,
 
 	struct btrfs_fs_info *fs_info = trans->fs_info;
 	struct btrfs_delayed_ref_root *delayed_refs;
-	int ret;
+	int ret = 0;
 
 	delayed_refs = &trans->transaction->delayed_refs;
 
@@ -1868,7 +1868,7 @@ static int cleanup_ref_head(struct btrfs_trans_handle *trans,
 	trace_run_delayed_ref_head(fs_info, head, 0);
 	btrfs_delayed_ref_unlock(head);
 	btrfs_put_delayed_ref_head(head);
-	return 0;
+	return ret;
 }
 
 static struct btrfs_delayed_ref_head *btrfs_obtain_ref_head(
-- 
2.26.3
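
The second fix concerns a caller that computes an error and then discards it.
The following is only a minimal userspace sketch under stated assumptions, not
the cleanup_ref_head() code: fake_del_csums(), fake_release(), cleanup_old()
and cleanup_new() are hypothetical names, and the released counter stands in
for the unlock/put calls that must run on both the success and error paths.

```c
#include <errno.h>

/* Hypothetical stand-in for btrfs_del_csums(): fails on request. */
static int fake_del_csums(int should_fail)
{
	return should_fail ? -ENOMEM : 0;
}

/* Hypothetical stand-in for the unlock/put cleanup at the end. */
static void fake_release(int *released)
{
	(*released)++;
}

/* Old shape: ret is assigned but the function always returns 0. */
static int cleanup_old(int delete_csums, int should_fail, int *released)
{
	int ret = 0;

	if (delete_csums)
		ret = fake_del_csums(should_fail);
	(void)ret;		/* the error is dropped here */
	fake_release(released);
	return 0;
}

/* New shape: cleanup still runs, and the error reaches the caller. */
static int cleanup_new(int delete_csums, int should_fail, int *released)
{
	int ret = 0;

	if (delete_csums)
		ret = fake_del_csums(should_fail);
	fake_release(released);
	return ret;
}
```

In both versions the release step runs exactly once per call.  Only
cleanup_new() lets the caller see -ENOMEM and react to it.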



* Re: [PATCH 1/2] btrfs: fix error handling in btrfs_del_csums
  2021-05-19 14:52 ` [PATCH 1/2] btrfs: fix error handling in btrfs_del_csums Josef Bacik
@ 2021-05-20  1:02   ` Qu Wenruo
  0 siblings, 0 replies; 7+ messages in thread
From: Qu Wenruo @ 2021-05-20  1:02 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team



On 2021/5/19 10:52 PM, Josef Bacik wrote:
> Error injection stress testing would sometimes fail with checksums on disk that
> did not have a corresponding extent.  This occurred because the pattern
> in btrfs_del_csums was
>
> 	while (1) {
> 		ret = btrfs_search_slot();
> 		if (ret < 0)
> 			break;
> 	}
> 	ret = 0;
> out:

Such a "ret = 0;" definitely causes problems when break is used.

> 	btrfs_free_path(path);
> 	return ret;
>
> If we got an error from btrfs_search_slot we'd clear the error because
> we were breaking instead of goto out.  Instead of using goto out, simply
> handle the cases where we may leave a random value in ret, and get rid
> of the
>
> 	ret = 0;
> out:
>
> pattern and simply allow break to have the proper error reporting.  With
> this fix we properly abort the transaction and do not commit thinking we
> successfully deleted the csum.
>
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Qu Wenruo <wqu@suse.com>

Thanks,
Qu

> ---
>   fs/btrfs/file-item.c | 10 +++++-----
>   1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/fs/btrfs/file-item.c b/fs/btrfs/file-item.c
> index 294602f139ef..a5a8dac334e8 100644
> --- a/fs/btrfs/file-item.c
> +++ b/fs/btrfs/file-item.c
> @@ -788,7 +788,7 @@ int btrfs_del_csums(struct btrfs_trans_handle *trans,
>   	u64 end_byte = bytenr + len;
>   	u64 csum_end;
>   	struct extent_buffer *leaf;
> -	int ret;
> +	int ret = 0;
>   	const u32 csum_size = fs_info->csum_size;
>   	u32 blocksize_bits = fs_info->sectorsize_bits;
>
> @@ -806,6 +806,7 @@ int btrfs_del_csums(struct btrfs_trans_handle *trans,
>
>   		ret = btrfs_search_slot(trans, root, &key, path, -1, 1);
>   		if (ret > 0) {
> +			ret = 0;
>   			if (path->slots[0] == 0)
>   				break;
>   			path->slots[0]--;
> @@ -862,7 +863,7 @@ int btrfs_del_csums(struct btrfs_trans_handle *trans,
>   			ret = btrfs_del_items(trans, root, path,
>   					      path->slots[0], del_nr);
>   			if (ret)
> -				goto out;
> +				break;
>   			if (key.offset == bytenr)
>   				break;
>   		} else if (key.offset < bytenr && csum_end > end_byte) {
> @@ -906,8 +907,9 @@ int btrfs_del_csums(struct btrfs_trans_handle *trans,
>   			ret = btrfs_split_item(trans, root, path, &key, offset);
>   			if (ret && ret != -EAGAIN) {
>   				btrfs_abort_transaction(trans, ret);
> -				goto out;
> +				break;
>   			}
> +			ret = 0;
>
>   			key.offset = end_byte - 1;
>   		} else {
> @@ -917,8 +919,6 @@ int btrfs_del_csums(struct btrfs_trans_handle *trans,
>   		}
>   		btrfs_release_path(path);
>   	}
> -	ret = 0;
> -out:
>   	btrfs_free_path(path);
>   	return ret;
>   }
>


* Re: [PATCH 2/2] btrfs: return errors from btrfs_del_csums in cleanup_ref_head
  2021-05-19 14:52 ` [PATCH 2/2] btrfs: return errors from btrfs_del_csums in cleanup_ref_head Josef Bacik
@ 2021-05-20  1:02   ` Qu Wenruo
  2021-05-21 12:20   ` David Sterba
  1 sibling, 0 replies; 7+ messages in thread
From: Qu Wenruo @ 2021-05-20  1:02 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team



On 2021/5/19 10:52 PM, Josef Bacik wrote:
> We are unconditionally returning 0 in cleanup_ref_head, despite the fact
> that btrfs_del_csums could fail.  We need to return the error so the
> transaction gets aborted properly.  Fix this by returning ret from
> btrfs_del_csums in cleanup_ref_head.
>
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Qu Wenruo <wqu@suse.com>

Thanks,
Qu
> ---
>   fs/btrfs/extent-tree.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index b84bbc24ff57..790de24576ac 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -1826,7 +1826,7 @@ static int cleanup_ref_head(struct btrfs_trans_handle *trans,
>
>   	struct btrfs_fs_info *fs_info = trans->fs_info;
>   	struct btrfs_delayed_ref_root *delayed_refs;
> -	int ret;
> +	int ret = 0;
>
>   	delayed_refs = &trans->transaction->delayed_refs;
>
> @@ -1868,7 +1868,7 @@ static int cleanup_ref_head(struct btrfs_trans_handle *trans,
>   	trace_run_delayed_ref_head(fs_info, head, 0);
>   	btrfs_delayed_ref_unlock(head);
>   	btrfs_put_delayed_ref_head(head);
> -	return 0;
> +	return ret;
>   }
>
>   static struct btrfs_delayed_ref_head *btrfs_obtain_ref_head(
>


* Re: [PATCH 2/2] btrfs: return errors from btrfs_del_csums in cleanup_ref_head
  2021-05-19 14:52 ` [PATCH 2/2] btrfs: return errors from btrfs_del_csums in cleanup_ref_head Josef Bacik
  2021-05-20  1:02   ` Qu Wenruo
@ 2021-05-21 12:20   ` David Sterba
  1 sibling, 0 replies; 7+ messages in thread
From: David Sterba @ 2021-05-21 12:20 UTC (permalink / raw)
  To: Josef Bacik; +Cc: linux-btrfs, kernel-team

On Wed, May 19, 2021 at 10:52:46AM -0400, Josef Bacik wrote:
> We are unconditionally returning 0 in cleanup_ref_head, despite the fact
> that btrfs_del_csums could fail.  We need to return the error so the
> transaction gets aborted properly.  Fix this by returning ret from
> btrfs_del_csums in cleanup_ref_head.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
>  fs/btrfs/extent-tree.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index b84bbc24ff57..790de24576ac 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -1826,7 +1826,7 @@ static int cleanup_ref_head(struct btrfs_trans_handle *trans,
>  
>  	struct btrfs_fs_info *fs_info = trans->fs_info;
>  	struct btrfs_delayed_ref_root *delayed_refs;
> -	int ret;
> +	int ret = 0;

>  
>  	delayed_refs = &trans->transaction->delayed_refs;

ret is assigned a return value just after this line:

	ret = run_and_cleanup_extent_op(trans, head);

so it is not necessary to initialize it at the declaration.

>  
> @@ -1868,7 +1868,7 @@ static int cleanup_ref_head(struct btrfs_trans_handle *trans,
>  	trace_run_delayed_ref_head(fs_info, head, 0);
>  	btrfs_delayed_ref_unlock(head);
>  	btrfs_put_delayed_ref_head(head);
> -	return 0;
> +	return ret;
>  }
>  
>  static struct btrfs_delayed_ref_head *btrfs_obtain_ref_head(
> -- 
> 2.26.3


* Re: [PATCH 0/2] btrfs_del_csums error handling fixes
  2021-05-19 14:52 [PATCH 0/2] btrfs_del_csums error handling fixes Josef Bacik
  2021-05-19 14:52 ` [PATCH 1/2] btrfs: fix error handling in btrfs_del_csums Josef Bacik
  2021-05-19 14:52 ` [PATCH 2/2] btrfs: return errors from btrfs_del_csums in cleanup_ref_head Josef Bacik
@ 2021-05-21 12:24 ` David Sterba
  2 siblings, 0 replies; 7+ messages in thread
From: David Sterba @ 2021-05-21 12:24 UTC (permalink / raw)
  To: Josef Bacik; +Cc: linux-btrfs, kernel-team

On Wed, May 19, 2021 at 10:52:44AM -0400, Josef Bacik wrote:
> Hello,
> 
> Here are two fixes related to deleting csums.  Doing error injection stress
> testing I was consistently seeing cases where we had a corrupt file system with
> csums that existed without the corresponding extents being written.  This was
> occurring because we were losing the return value in two cases, both of which
> would result in this style of corruption.  With these two patches I'm no longer
> seeing these errors.  Thanks,
> 
> Josef
> 
> Josef Bacik (2):
>   btrfs: fix error handling in btrfs_del_csums
>   btrfs: return errors from btrfs_del_csums in cleanup_ref_head

Added to misc-next, thanks.

