* [PATCH] btrfs: fix error handling in commit_fs_roots
@ 2020-12-01 14:53 Josef Bacik
  2020-12-01 18:08 ` Nikolay Borisov
  2020-12-04 16:52 ` David Sterba
  0 siblings, 2 replies; 4+ messages in thread
From: Josef Bacik @ 2020-12-01 14:53 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

While doing error injection I would sometimes get a corrupt file system.
This is because I was injecting errors at btrfs_search_slot, but would
only do it one time per stack.  This uncovered a problem in
commit_fs_roots, where if we get an error we would just break.  However
we're in a second loop, the first loop being a loop to find all the
dirty fs roots, and then subsequent root updates would succeed clearing
the error value.

This isn't likely to happen in real scenarios, however we could
potentially get a random ENOMEM once and then not again, and we'd end up
with a corrupted file system.  Fix this by moving the error checking
around a bit to the main loop, as this is the only place where something
will fail, and return the error as soon as it occurs.

With this patch my reproducer no longer corrupts the file system.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/transaction.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 1dac76b7ea96..b05f75654b16 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -1328,7 +1328,6 @@ static noinline int commit_fs_roots(struct btrfs_trans_handle *trans)
 	struct btrfs_root *gang[8];
 	int i;
 	int ret;
-	int err = 0;
 
 	spin_lock(&fs_info->fs_roots_radix_lock);
 	while (1) {
@@ -1340,6 +1339,8 @@ static noinline int commit_fs_roots(struct btrfs_trans_handle *trans)
 			break;
 		for (i = 0; i < ret; i++) {
 			struct btrfs_root *root = gang[i];
+			int err;
+
 			radix_tree_tag_clear(&fs_info->fs_roots_radix,
 					(unsigned long)root->root_key.objectid,
 					BTRFS_ROOT_TRANS_TAG);
@@ -1366,14 +1367,14 @@ static noinline int commit_fs_roots(struct btrfs_trans_handle *trans)
 			err = btrfs_update_root(trans, fs_info->tree_root,
 						&root->root_key,
 						&root->root_item);
-			spin_lock(&fs_info->fs_roots_radix_lock);
 			if (err)
-				break;
+				return err;
+			spin_lock(&fs_info->fs_roots_radix_lock);
 			btrfs_qgroup_free_meta_all_pertrans(root);
 		}
 	}
 	spin_unlock(&fs_info->fs_roots_radix_lock);
-	return err;
+	return 0;
 }
 
 /*
-- 
2.26.2
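
The failure mode described in the commit message can be reproduced in a small userspace model. The sketch below is not kernel code: tag_all_roots(), gang_lookup(), fake_update_root(), the array of tags, and the constants are hypothetical stand-ins, assumed only for illustration, for the radix tree tag, the gang lookup, and btrfs_update_root(). It assumes the update fails exactly once, for the first root, as in the error injection run. In the old shape, the break only leaves the inner loop, the outer loop looks up the remaining tagged roots, and their successful updates overwrite err. In the new shape, the error is returned as soon as it occurs.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_ROOTS    4
#define GANG_SIZE   2
#define FAKE_ENOMEM 12	/* illustrative error value, not taken from errno.h */

/* Hypothetical stand-in for the BTRFS_ROOT_TRANS_TAG radix tree tag. */
static bool tagged[NR_ROOTS];

void tag_all_roots(void)
{
	for (int r = 0; r < NR_ROOTS; r++)
		tagged[r] = true;
}

/* Hypothetical stand-in for the gang lookup: collect up to GANG_SIZE tagged roots. */
static int gang_lookup(int *gang)
{
	int n = 0;

	for (int r = 0; r < NR_ROOTS && n < GANG_SIZE; r++)
		if (tagged[r])
			gang[n++] = r;
	return n;
}

/* Hypothetical stand-in for btrfs_update_root(): fails only for root 0. */
static int fake_update_root(int root)
{
	return root == 0 ? -FAKE_ENOMEM : 0;
}

/* Old shape: break leaves only the inner loop, later successes reset err. */
int commit_roots_old(void)
{
	int gang[GANG_SIZE];
	int err = 0;

	while (1) {
		int ret = gang_lookup(gang);

		if (ret == 0)
			break;
		for (int i = 0; i < ret; i++) {
			tagged[gang[i]] = false;
			err = fake_update_root(gang[i]);
			if (err)
				break;
		}
	}
	return err;
}

/* New shape: return the error as soon as it occurs. */
int commit_roots_new(void)
{
	int gang[GANG_SIZE];

	while (1) {
		int ret = gang_lookup(gang);

		if (ret == 0)
			break;
		for (int i = 0; i < ret; i++) {
			int err;

			tagged[gang[i]] = false;
			err = fake_update_root(gang[i]);
			if (err)
				return err;
		}
	}
	return 0;
}

/* The old shape loses the error; the new shape propagates it. */
void check_demo(void)
{
	tag_all_roots();
	assert(commit_roots_old() == 0);
	tag_all_roots();
	assert(commit_roots_new() == -FAKE_ENOMEM);
}
```

Calling check_demo() from a main() exercises both variants. The model leaves out the spinlock entirely, so it says nothing about the locking change in the patch, only about how the error value is lost.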



* Re: [PATCH] btrfs: fix error handling in commit_fs_roots
  2020-12-01 14:53 [PATCH] btrfs: fix error handling in commit_fs_roots Josef Bacik
@ 2020-12-01 18:08 ` Nikolay Borisov
  2020-12-04 16:52 ` David Sterba
  1 sibling, 0 replies; 4+ messages in thread
From: Nikolay Borisov @ 2020-12-01 18:08 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team



On 1.12.20 at 16:53, Josef Bacik wrote:
> While doing error injection I would sometimes get a corrupt file system.
> This is because I was injecting errors at btrfs_search_slot, but would
> only do it one time per stack.  This uncovered a problem in
> commit_fs_roots, where if we get an error we would just break.  However
> we're in a second loop, the first loop being a loop to find all the

nit: s/second/nested/, as initially I got confused about us being in a 2nd
loop iteration. Using "nested" makes it a bit clearer.

> dirty fs roots, and then subsequent root updates would succeed clearing
> the error value.
> 
> This isn't likely to happen in real scenarios, however we could
> potentially get a random ENOMEM once and then not again, and we'd end up
> with a corrupted file system.  Fix this by moving the error checking
> around a bit to the main loop, as this is the only place where something
> will fail, and return the error as soon as it occurs.
> 
> With this patch my reproducer no longer corrupts the file system.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
>  fs/btrfs/transaction.c | 9 +++++----
>  1 file changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
> index 1dac76b7ea96..b05f75654b16 100644
> --- a/fs/btrfs/transaction.c
> +++ b/fs/btrfs/transaction.c
> @@ -1328,7 +1328,6 @@ static noinline int commit_fs_roots(struct btrfs_trans_handle *trans)
>  	struct btrfs_root *gang[8];
>  	int i;
>  	int ret;
> -	int err = 0;
>  
>  	spin_lock(&fs_info->fs_roots_radix_lock);
>  	while (1) {
> @@ -1340,6 +1339,8 @@ static noinline int commit_fs_roots(struct btrfs_trans_handle *trans)
>  			break;
>  		for (i = 0; i < ret; i++) {
>  			struct btrfs_root *root = gang[i];
> +			int err;
> +
>  			radix_tree_tag_clear(&fs_info->fs_roots_radix,
>  					(unsigned long)root->root_key.objectid,
>  					BTRFS_ROOT_TRANS_TAG);
> @@ -1366,14 +1367,14 @@ static noinline int commit_fs_roots(struct btrfs_trans_handle *trans)
>  			err = btrfs_update_root(trans, fs_info->tree_root,
>  						&root->root_key,
>  						&root->root_item);
> -			spin_lock(&fs_info->fs_roots_radix_lock);
>  			if (err)
> -				break;
> +				return err;
> +			spin_lock(&fs_info->fs_roots_radix_lock);
>  			btrfs_qgroup_free_meta_all_pertrans(root);
>  		}
>  	}
>  	spin_unlock(&fs_info->fs_roots_radix_lock);
> -	return err;
> +	return 0;
>  }
>  
>  /*
> 


* Re: [PATCH] btrfs: fix error handling in commit_fs_roots
  2020-12-01 14:53 [PATCH] btrfs: fix error handling in commit_fs_roots Josef Bacik
  2020-12-01 18:08 ` Nikolay Borisov
@ 2020-12-04 16:52 ` David Sterba
  2020-12-04 19:52   ` Josef Bacik
  1 sibling, 1 reply; 4+ messages in thread
From: David Sterba @ 2020-12-04 16:52 UTC (permalink / raw)
  To: Josef Bacik; +Cc: linux-btrfs, kernel-team

On Tue, Dec 01, 2020 at 09:53:23AM -0500, Josef Bacik wrote:
> While doing error injection I would sometimes get a corrupt file system.
> This is because I was injecting errors at btrfs_search_slot, but would
> only do it one time per stack.  This uncovered a problem in
> commit_fs_roots, where if we get an error we would just break.  However
> we're in a second loop, the first loop being a loop to find all the
> dirty fs roots, and then subsequent root updates would succeed clearing
> the error value.
> 
> This isn't likely to happen in real scenarios, however we could
> potentially get a random ENOMEM once and then not again, and we'd end up
> with a corrupted file system.  Fix this by moving the error checking
> around a bit to the main loop, as this is the only place where something
> will fail, and return the error as soon as it occurs.
> 
> With this patch my reproducer no longer corrupts the file system.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
>  fs/btrfs/transaction.c | 9 +++++----
>  1 file changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
> index 1dac76b7ea96..b05f75654b16 100644
> --- a/fs/btrfs/transaction.c
> +++ b/fs/btrfs/transaction.c
> @@ -1328,7 +1328,6 @@ static noinline int commit_fs_roots(struct btrfs_trans_handle *trans)
>  	struct btrfs_root *gang[8];
>  	int i;
>  	int ret;
> -	int err = 0;
>  
>  	spin_lock(&fs_info->fs_roots_radix_lock);
>  	while (1) {
> @@ -1340,6 +1339,8 @@ static noinline int commit_fs_roots(struct btrfs_trans_handle *trans)
>  			break;
>  		for (i = 0; i < ret; i++) {
>  			struct btrfs_root *root = gang[i];
> +			int err;

I'd rather get rid of 'err' for the return values, in this case we can
reuse 'ret'.

> +
>  			radix_tree_tag_clear(&fs_info->fs_roots_radix,
>  					(unsigned long)root->root_key.objectid,
>  					BTRFS_ROOT_TRANS_TAG);
> @@ -1366,14 +1367,14 @@ static noinline int commit_fs_roots(struct btrfs_trans_handle *trans)
>  			err = btrfs_update_root(trans, fs_info->tree_root,
>  						&root->root_key,
>  						&root->root_item);
> -			spin_lock(&fs_info->fs_roots_radix_lock);
>  			if (err)
> -				break;
> +				return err;
> +			spin_lock(&fs_info->fs_roots_radix_lock);
>  			btrfs_qgroup_free_meta_all_pertrans(root);

Do we need to call btrfs_qgroup_free_meta_all_pertrans before returning?

>  		}
>  	}
>  	spin_unlock(&fs_info->fs_roots_radix_lock);
> -	return err;
> +	return 0;
>  }
>  
>  /*
> -- 
> 2.26.2


* Re: [PATCH] btrfs: fix error handling in commit_fs_roots
  2020-12-04 16:52 ` David Sterba
@ 2020-12-04 19:52   ` Josef Bacik
  0 siblings, 0 replies; 4+ messages in thread
From: Josef Bacik @ 2020-12-04 19:52 UTC (permalink / raw)
  To: dsterba, linux-btrfs, kernel-team

On 12/4/20 11:52 AM, David Sterba wrote:
> On Tue, Dec 01, 2020 at 09:53:23AM -0500, Josef Bacik wrote:
>> While doing error injection I would sometimes get a corrupt file system.
>> This is because I was injecting errors at btrfs_search_slot, but would
>> only do it one time per stack.  This uncovered a problem in
>> commit_fs_roots, where if we get an error we would just break.  However
>> we're in a second loop, the first loop being a loop to find all the
>> dirty fs roots, and then subsequent root updates would succeed clearing
>> the error value.
>>
>> This isn't likely to happen in real scenarios, however we could
>> potentially get a random ENOMEM once and then not again, and we'd end up
>> with a corrupted file system.  Fix this by moving the error checking
>> around a bit to the main loop, as this is the only place where something
>> will fail, and return the error as soon as it occurs.
>>
>> With this patch my reproducer no longer corrupts the file system.
>>
>> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
>> ---
>>   fs/btrfs/transaction.c | 9 +++++----
>>   1 file changed, 5 insertions(+), 4 deletions(-)
>>
>> diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
>> index 1dac76b7ea96..b05f75654b16 100644
>> --- a/fs/btrfs/transaction.c
>> +++ b/fs/btrfs/transaction.c
>> @@ -1328,7 +1328,6 @@ static noinline int commit_fs_roots(struct btrfs_trans_handle *trans)
>>   	struct btrfs_root *gang[8];
>>   	int i;
>>   	int ret;
>> -	int err = 0;
>>   
>>   	spin_lock(&fs_info->fs_roots_radix_lock);
>>   	while (1) {
>> @@ -1340,6 +1339,8 @@ static noinline int commit_fs_roots(struct btrfs_trans_handle *trans)
>>   			break;
>>   		for (i = 0; i < ret; i++) {
>>   			struct btrfs_root *root = gang[i];
>> +			int err;
> 
> I'd rather get rid of 'err' for the return values, in this case we can
> reuse 'ret'.
> 

Sure, I'll fix and respin.

>> +
>>   			radix_tree_tag_clear(&fs_info->fs_roots_radix,
>>   					(unsigned long)root->root_key.objectid,
>>   					BTRFS_ROOT_TRANS_TAG);
>> @@ -1366,14 +1367,14 @@ static noinline int commit_fs_roots(struct btrfs_trans_handle *trans)
>>   			err = btrfs_update_root(trans, fs_info->tree_root,
>>   						&root->root_key,
>>   						&root->root_item);
>> -			spin_lock(&fs_info->fs_roots_radix_lock);
>>   			if (err)
>> -				break;
>> +				return err;
>> +			spin_lock(&fs_info->fs_roots_radix_lock);
>>   			btrfs_qgroup_free_meta_all_pertrans(root);
> 
> Do we need to call btrfs_qgroup_free_meta_all_pertrans before returning?
> 

It doesn't look like it, and we'd miss any existing roots if we did.  It doesn't
appear that any qgroup accounting gets cleaned up in the case of an error, so it
must work out already?  If not then we need to address that separately, because
if we're relying on the cleanup to happen here we'll mess up if the error
happens before we even get to commit_fs_roots.  Thanks,

Josef

