linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/4] Error condition failure fixes
@ 2020-02-11 21:40 Josef Bacik
  2020-02-11 21:40 ` [PATCH 1/4] btrfs: set fs_root = NULL on error Josef Bacik
                   ` (3 more replies)
  0 siblings, 4 replies; 20+ messages in thread
From: Josef Bacik @ 2020-02-11 21:40 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

I've been running my bpf error injection stress script and been fixing what has
fallen out.  I don't think I've fixed everything yet, but to reduce the noise
from Dave's testing here's the current set of fixes I have.  These are based on
misc-next from late last week, but should apply cleanly to the recent set.
These aren't super important, but will cut down on the noise from things like
generic/019 and generic/475.  Thanks,

Josef


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 1/4] btrfs: set fs_root = NULL on error
  2020-02-11 21:40 [PATCH 0/4] Error condition failure fixes Josef Bacik
@ 2020-02-11 21:40 ` Josef Bacik
  2020-02-12  0:55   ` Qu Wenruo
                     ` (2 more replies)
  2020-02-11 21:40 ` [PATCH 2/4] btrfs: do not check delayed items are empty for single trans cleanup Josef Bacik
                   ` (2 subsequent siblings)
  3 siblings, 3 replies; 20+ messages in thread
From: Josef Bacik @ 2020-02-11 21:40 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

While running my error injection script I hit a panic when we tried to
clean up the fs_root when free'ing the fs_root.  This is because
fs_info->fs_root == PTR_ERR(-EIO), which isn't great.  Fix this by
setting fs_info->fs_root = NULL; if we fail to read the root.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/disk-io.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index eb441fa3711b..5b6140482cef 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -3260,6 +3260,7 @@ int __cold open_ctree(struct super_block *sb,
 	if (IS_ERR(fs_info->fs_root)) {
 		err = PTR_ERR(fs_info->fs_root);
 		btrfs_warn(fs_info, "failed to read fs tree: %d", err);
+		fs_info->fs_root = NULL;
 		goto fail_qgroup;
 	}
 
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 2/4] btrfs: do not check delayed items are empty for single trans cleanup
  2020-02-11 21:40 [PATCH 0/4] Error condition failure fixes Josef Bacik
  2020-02-11 21:40 ` [PATCH 1/4] btrfs: set fs_root = NULL on error Josef Bacik
@ 2020-02-11 21:40 ` Josef Bacik
  2020-02-12  0:57   ` Qu Wenruo
                     ` (2 more replies)
  2020-02-11 21:40 ` [PATCH 3/4] btrfs: handle logged extent failure properly Josef Bacik
  2020-02-11 21:40 ` [PATCH 4/4] btrfs: fix bytes_may_use underflow in prealloc error condtition Josef Bacik
  3 siblings, 3 replies; 20+ messages in thread
From: Josef Bacik @ 2020-02-11 21:40 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

btrfs_assert_delayed_root_empty() will check if the delayed root is
completely empty, but this is a fs wide check.  On cleanup we may have
allowed other transactions to begin, for whatever reason, and thus the
delayed root is not empty.  So remove this check from
cleanup_one_transation().  This however can stay in
btrfs_cleanup_transaction(), because it checks only after all of the
transactions have been properly cleaned up, and thus is valid.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/disk-io.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 5b6140482cef..601ed3335cf6 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -4543,7 +4543,6 @@ void btrfs_cleanup_one_transaction(struct btrfs_transaction *cur_trans,
 	wake_up(&fs_info->transaction_wait);
 
 	btrfs_destroy_delayed_inodes(fs_info);
-	btrfs_assert_delayed_root_empty(fs_info);
 
 	btrfs_destroy_marked_extents(fs_info, &cur_trans->dirty_pages,
 				     EXTENT_DIRTY);
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 3/4] btrfs: handle logged extent failure properly
  2020-02-11 21:40 [PATCH 0/4] Error condition failure fixes Josef Bacik
  2020-02-11 21:40 ` [PATCH 1/4] btrfs: set fs_root = NULL on error Josef Bacik
  2020-02-11 21:40 ` [PATCH 2/4] btrfs: do not check delayed items are empty for single trans cleanup Josef Bacik
@ 2020-02-11 21:40 ` Josef Bacik
  2020-02-12  0:58   ` Qu Wenruo
                     ` (2 more replies)
  2020-02-11 21:40 ` [PATCH 4/4] btrfs: fix bytes_may_use underflow in prealloc error condtition Josef Bacik
  3 siblings, 3 replies; 20+ messages in thread
From: Josef Bacik @ 2020-02-11 21:40 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

If we're allocating a logged extent we attempt to insert an extent
record for the file extent directly.  We increase
space_info->bytes_reserved, because the extent entry addition will call
btrfs_update_block_group(), which will convert the ->bytes_reserved to
->bytes_used.  However if we fail at any point while inserting the
extent entry we will bail and leave space on ->bytes_reserved, which
will trigger a WARN_ON() on umount.  Fix this by pinning the space if we
fail to insert, which is what happens in every other failure case that
involves adding the extent entry.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/extent-tree.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index c43acb329fa6..2b4c3ca5e651 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -4430,6 +4430,8 @@ int btrfs_alloc_logged_file_extent(struct btrfs_trans_handle *trans,
 
 	ret = alloc_reserved_file_extent(trans, 0, root_objectid, 0, owner,
 					 offset, ins, 1);
+	if (ret)
+		btrfs_pin_extent(fs_info, ins->objectid, ins->offset, 1);
 	btrfs_put_block_group(block_group);
 	return ret;
 }
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 4/4] btrfs: fix bytes_may_use underflow in prealloc error condtition
  2020-02-11 21:40 [PATCH 0/4] Error condition failure fixes Josef Bacik
                   ` (2 preceding siblings ...)
  2020-02-11 21:40 ` [PATCH 3/4] btrfs: handle logged extent failure properly Josef Bacik
@ 2020-02-11 21:40 ` Josef Bacik
  2020-02-12  1:00   ` Qu Wenruo
  2020-02-13 10:17   ` Nikolay Borisov
  3 siblings, 2 replies; 20+ messages in thread
From: Josef Bacik @ 2020-02-11 21:40 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

I hit the following warning while running my error injection stress testing

------------[ cut here ]------------
WARNING: CPU: 3 PID: 1453 at fs/btrfs/space-info.h:108 btrfs_free_reserved_data_space_noquota+0xfd/0x160 [btrfs]
RIP: 0010:btrfs_free_reserved_data_space_noquota+0xfd/0x160 [btrfs]
Call Trace:
btrfs_free_reserved_data_space+0x4f/0x70 [btrfs]
__btrfs_prealloc_file_range+0x378/0x470 [btrfs]
elfcorehdr_read+0x40/0x40
? elfcorehdr_read+0x40/0x40
? btrfs_commit_transaction+0xca/0xa50 [btrfs]
? dput+0xb4/0x2a0
? btrfs_log_dentry_safe+0x55/0x70 [btrfs]
? btrfs_sync_file+0x30e/0x420 [btrfs]
? do_fsync+0x38/0x70
? __x64_sys_fdatasync+0x13/0x20
? do_syscall_64+0x5b/0x1b0
? entry_SYSCALL_64_after_hwframe+0x44/0xa9
---[ end trace 70ccb5d0fe51151c ]---

This happens if we fail to insert our reserved file extent.  At this
point we've already converted our reservation from ->bytes_may_use to
->bytes_reserved.  However once we break we will attempt to free
everything from [cur_offset, end] from ->bytes_may_use, but our extent
reservation will overlap part of this.

Fix this problem by adding ins.offset (our extent allocation size) to
cur_offset so we remove the actual remaining part from ->bytes_may_use.

I validated this fix using my inject-error.py script

python inject-error.py -o should_fail_bio -t cache_save_setup -t \
	__btrfs_prealloc_file_range \
	-t insert_reserved_file_extent.constprop.0 \
	-r "-5" ./run-fsstress.sh

where run-fsstress.sh simply mounts and runs fsstress on a disk.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/inode.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 84e649724549..747d860aedf6 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -9919,6 +9919,14 @@ static int __btrfs_prealloc_file_range(struct inode *inode, int mode,
 						  ins.offset, 0, 0, 0,
 						  BTRFS_FILE_EXTENT_PREALLOC);
 		if (ret) {
+			/*
+			 * We've reserved this space, and thus converted it from
+			 * ->bytes_may_use to ->bytes_reserved, which we cleanup
+			 * here.  We need to adjust cur_offset so that we only
+			 * drop the ->bytes_may_use for the area we still have
+			 * remaining in ->>bytes_may_use.
+			 */
+			cur_offset += ins.objectid;
 			btrfs_free_reserved_extent(fs_info, ins.objectid,
 						   ins.offset, 0);
 			btrfs_abort_transaction(trans, ret);
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/4] btrfs: set fs_root = NULL on error
  2020-02-11 21:40 ` [PATCH 1/4] btrfs: set fs_root = NULL on error Josef Bacik
@ 2020-02-12  0:55   ` Qu Wenruo
  2020-02-13 10:05   ` Johannes Thumshirn
  2020-02-13 10:48   ` Nikolay Borisov
  2 siblings, 0 replies; 20+ messages in thread
From: Qu Wenruo @ 2020-02-12  0:55 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team


[-- Attachment #1.1: Type: text/plain, Size: 1228 bytes --]



On 2020/2/12 上午5:40, Josef Bacik wrote:
> While running my error injection script I hit a panic when we tried to
> clean up the fs_root when free'ing the fs_root.  This is because
> fs_info->fs_root == PTR_ERR(-EIO), which isn't great.  Fix this by
> setting fs_info->fs_root = NULL; if we fail to read the root.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Qu Wenruo <wqu@suse.com>

Just one off-topic idea, can we have test cases in fstests to do
specific error injection test?

For your fix, we can inject ENOMEM error with call chain
btrfs_read_fs_root_no_name()->open_ctree() to get a 100% reproducible
test, which looks to be a solid test case.

Thanks,
Qu

> ---
>  fs/btrfs/disk-io.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index eb441fa3711b..5b6140482cef 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -3260,6 +3260,7 @@ int __cold open_ctree(struct super_block *sb,
>  	if (IS_ERR(fs_info->fs_root)) {
>  		err = PTR_ERR(fs_info->fs_root);
>  		btrfs_warn(fs_info, "failed to read fs tree: %d", err);
> +		fs_info->fs_root = NULL;
>  		goto fail_qgroup;
>  	}
>  
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/4] btrfs: do not check delayed items are empty for single trans cleanup
  2020-02-11 21:40 ` [PATCH 2/4] btrfs: do not check delayed items are empty for single trans cleanup Josef Bacik
@ 2020-02-12  0:57   ` Qu Wenruo
  2020-02-13 10:28   ` Nikolay Borisov
  2020-02-13 11:15   ` Johannes Thumshirn
  2 siblings, 0 replies; 20+ messages in thread
From: Qu Wenruo @ 2020-02-12  0:57 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team


[-- Attachment #1.1: Type: text/plain, Size: 1401 bytes --]



On 2020/2/12 上午5:40, Josef Bacik wrote:
> btrfs_assert_delayed_root_empty() will check if the delayed root is
> completely empty, but this is a fs wide check.  On cleanup we may have
> allowed other transactions to begin, for whatever reason, and thus the
> delayed root is not empty.  So remove this check from
> cleanup_one_transation().  This however can stay in
> btrfs_cleanup_transaction(), because it checks only after all of the
> transactions have been properly cleaned up, and thus is valid.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Qu Wenruo <wqu@suse.com>

Just a nitpick, to allow other user to verify the fix, would you mind to
provide a specific reproducer?
Like the error injection (I guess it's still memory allocation failure),
the call chain.

Thanks,
Qu
> ---
>  fs/btrfs/disk-io.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index 5b6140482cef..601ed3335cf6 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -4543,7 +4543,6 @@ void btrfs_cleanup_one_transaction(struct btrfs_transaction *cur_trans,
>  	wake_up(&fs_info->transaction_wait);
>  
>  	btrfs_destroy_delayed_inodes(fs_info);
> -	btrfs_assert_delayed_root_empty(fs_info);
>  
>  	btrfs_destroy_marked_extents(fs_info, &cur_trans->dirty_pages,
>  				     EXTENT_DIRTY);
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/4] btrfs: handle logged extent failure properly
  2020-02-11 21:40 ` [PATCH 3/4] btrfs: handle logged extent failure properly Josef Bacik
@ 2020-02-12  0:58   ` Qu Wenruo
  2020-02-13 10:26   ` Nikolay Borisov
  2020-02-13 11:16   ` Johannes Thumshirn
  2 siblings, 0 replies; 20+ messages in thread
From: Qu Wenruo @ 2020-02-12  0:58 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team


[-- Attachment #1.1: Type: text/plain, Size: 1364 bytes --]



On 2020/2/12 上午5:40, Josef Bacik wrote:
> If we're allocating a logged extent we attempt to insert an extent
> record for the file extent directly.  We increase
> space_info->bytes_reserved, because the extent entry addition will call
> btrfs_update_block_group(), which will convert the ->bytes_reserved to
> ->bytes_used.  However if we fail at any point while inserting the
> extent entry we will bail and leave space on ->bytes_reserved, which
> will trigger a WARN_ON() on umount.  Fix this by pinning the space if we
> fail to insert, which is what happens in every other failure case that
> involves adding the extent entry.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Qu Wenruo <wqu@suse.com>

Thanks,
Qu

> ---
>  fs/btrfs/extent-tree.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index c43acb329fa6..2b4c3ca5e651 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -4430,6 +4430,8 @@ int btrfs_alloc_logged_file_extent(struct btrfs_trans_handle *trans,
>  
>  	ret = alloc_reserved_file_extent(trans, 0, root_objectid, 0, owner,
>  					 offset, ins, 1);
> +	if (ret)
> +		btrfs_pin_extent(fs_info, ins->objectid, ins->offset, 1);
>  	btrfs_put_block_group(block_group);
>  	return ret;
>  }
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 4/4] btrfs: fix bytes_may_use underflow in prealloc error condtition
  2020-02-11 21:40 ` [PATCH 4/4] btrfs: fix bytes_may_use underflow in prealloc error condtition Josef Bacik
@ 2020-02-12  1:00   ` Qu Wenruo
  2020-02-13 10:17   ` Nikolay Borisov
  1 sibling, 0 replies; 20+ messages in thread
From: Qu Wenruo @ 2020-02-12  1:00 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team


[-- Attachment #1.1: Type: text/plain, Size: 2687 bytes --]



On 2020/2/12 上午5:40, Josef Bacik wrote:
> I hit the following warning while running my error injection stress testing
> 
> ------------[ cut here ]------------
> WARNING: CPU: 3 PID: 1453 at fs/btrfs/space-info.h:108 btrfs_free_reserved_data_space_noquota+0xfd/0x160 [btrfs]
> RIP: 0010:btrfs_free_reserved_data_space_noquota+0xfd/0x160 [btrfs]
> Call Trace:
> btrfs_free_reserved_data_space+0x4f/0x70 [btrfs]
> __btrfs_prealloc_file_range+0x378/0x470 [btrfs]
> elfcorehdr_read+0x40/0x40
> ? elfcorehdr_read+0x40/0x40
> ? btrfs_commit_transaction+0xca/0xa50 [btrfs]
> ? dput+0xb4/0x2a0
> ? btrfs_log_dentry_safe+0x55/0x70 [btrfs]
> ? btrfs_sync_file+0x30e/0x420 [btrfs]
> ? do_fsync+0x38/0x70
> ? __x64_sys_fdatasync+0x13/0x20
> ? do_syscall_64+0x5b/0x1b0
> ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
> ---[ end trace 70ccb5d0fe51151c ]---
> 
> This happens if we fail to insert our reserved file extent.  At this
> point we've already converted our reservation from ->bytes_may_use to
> ->bytes_reserved.  However once we break we will attempt to free
> everything from [cur_offset, end] from ->bytes_may_use, but our extent
> reservation will overlap part of this.
> 
> Fix this problem by adding ins.offset (our extent allocation size) to
> cur_offset so we remove the actual remaining part from ->bytes_may_use.
> 
> I validated this fix using my inject-error.py script
> 
> python inject-error.py -o should_fail_bio -t cache_save_setup -t \
> 	__btrfs_prealloc_file_range \
> 	-t insert_reserved_file_extent.constprop.0 \
> 	-r "-5" ./run-fsstress.sh
> 
> where run-fsstress.sh simply mounts and runs fsstress on a disk.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Qu Wenruo <wqu@suse.com>

Thanks,
Qu

> ---
>  fs/btrfs/inode.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index 84e649724549..747d860aedf6 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -9919,6 +9919,14 @@ static int __btrfs_prealloc_file_range(struct inode *inode, int mode,
>  						  ins.offset, 0, 0, 0,
>  						  BTRFS_FILE_EXTENT_PREALLOC);
>  		if (ret) {
> +			/*
> +			 * We've reserved this space, and thus converted it from
> +			 * ->bytes_may_use to ->bytes_reserved, which we cleanup
> +			 * here.  We need to adjust cur_offset so that we only
> +			 * drop the ->bytes_may_use for the area we still have
> +			 * remaining in ->>bytes_may_use.
> +			 */
> +			cur_offset += ins.objectid;
>  			btrfs_free_reserved_extent(fs_info, ins.objectid,
>  						   ins.offset, 0);
>  			btrfs_abort_transaction(trans, ret);
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/4] btrfs: set fs_root = NULL on error
  2020-02-11 21:40 ` [PATCH 1/4] btrfs: set fs_root = NULL on error Josef Bacik
  2020-02-12  0:55   ` Qu Wenruo
@ 2020-02-13 10:05   ` Johannes Thumshirn
  2020-02-13 10:48   ` Nikolay Borisov
  2 siblings, 0 replies; 20+ messages in thread
From: Johannes Thumshirn @ 2020-02-13 10:05 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 4/4] btrfs: fix bytes_may_use underflow in prealloc error condtition
  2020-02-11 21:40 ` [PATCH 4/4] btrfs: fix bytes_may_use underflow in prealloc error condtition Josef Bacik
  2020-02-12  1:00   ` Qu Wenruo
@ 2020-02-13 10:17   ` Nikolay Borisov
  2020-02-13 15:29     ` Josef Bacik
  1 sibling, 1 reply; 20+ messages in thread
From: Nikolay Borisov @ 2020-02-13 10:17 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team



On 11.02.20 г. 23:40 ч., Josef Bacik wrote:
> I hit the following warning while running my error injection stress testing
> 
> ------------[ cut here ]------------
> WARNING: CPU: 3 PID: 1453 at fs/btrfs/space-info.h:108 btrfs_free_reserved_data_space_noquota+0xfd/0x160 [btrfs]
> RIP: 0010:btrfs_free_reserved_data_space_noquota+0xfd/0x160 [btrfs]
> Call Trace:
> btrfs_free_reserved_data_space+0x4f/0x70 [btrfs]
> __btrfs_prealloc_file_range+0x378/0x470 [btrfs]
> elfcorehdr_read+0x40/0x40
> ? elfcorehdr_read+0x40/0x40
> ? btrfs_commit_transaction+0xca/0xa50 [btrfs]
> ? dput+0xb4/0x2a0
> ? btrfs_log_dentry_safe+0x55/0x70 [btrfs]
> ? btrfs_sync_file+0x30e/0x420 [btrfs]
> ? do_fsync+0x38/0x70
> ? __x64_sys_fdatasync+0x13/0x20
> ? do_syscall_64+0x5b/0x1b0
> ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
> ---[ end trace 70ccb5d0fe51151c ]---
> 
> This happens if we fail to insert our reserved file extent.  At this
> point we've already converted our reservation from ->bytes_may_use to
> ->bytes_reserved.  However once we break we will attempt to free
> everything from [cur_offset, end] from ->bytes_may_use, but our extent
> reservation will overlap part of this.
> 
> Fix this problem by adding ins.offset (our extent allocation size) to
> cur_offset so we remove the actual remaining part from ->bytes_may_use.
This contradicts the code, you are adding ins.objectid which is the
offset and not the size. This means either the code is buggy.
<snip>

>  		if (ret) {
> +			/*
> +			 * We've reserved this space, and thus converted it from
> +			 * ->bytes_may_use to ->bytes_reserved, which we cleanup
> +			 * here.  We need to adjust cur_offset so that we only
> +			 * drop the ->bytes_may_use for the area we still have
> +			 * remaining in ->>bytes_may_use.
> +			 */
> +			cur_offset += ins.objectid;
>  			btrfs_free_reserved_extent(fs_info, ins.objectid,
>  						   ins.offset, 0);
>  			btrfs_abort_transaction(trans, ret);
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/4] btrfs: handle logged extent failure properly
  2020-02-11 21:40 ` [PATCH 3/4] btrfs: handle logged extent failure properly Josef Bacik
  2020-02-12  0:58   ` Qu Wenruo
@ 2020-02-13 10:26   ` Nikolay Borisov
  2020-02-13 11:16   ` Johannes Thumshirn
  2 siblings, 0 replies; 20+ messages in thread
From: Nikolay Borisov @ 2020-02-13 10:26 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team



On 11.02.20 г. 23:40 ч., Josef Bacik wrote:
> If we're allocating a logged extent we attempt to insert an extent
> record for the file extent directly.  We increase
> space_info->bytes_reserved, because the extent entry addition will call
> btrfs_update_block_group(), which will convert the ->bytes_reserved to
> ->bytes_used.  However if we fail at any point while inserting the
> extent entry we will bail and leave space on ->bytes_reserved, which
> will trigger a WARN_ON() on umount.  Fix this by pinning the space if we
> fail to insert, which is what happens in every other failure case that
> involves adding the extent entry.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Nikolay Borisov <nborisov@suse.com>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/4] btrfs: do not check delayed items are empty for single trans cleanup
  2020-02-11 21:40 ` [PATCH 2/4] btrfs: do not check delayed items are empty for single trans cleanup Josef Bacik
  2020-02-12  0:57   ` Qu Wenruo
@ 2020-02-13 10:28   ` Nikolay Borisov
  2020-02-13 11:15   ` Johannes Thumshirn
  2 siblings, 0 replies; 20+ messages in thread
From: Nikolay Borisov @ 2020-02-13 10:28 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team



On 11.02.20 г. 23:40 ч., Josef Bacik wrote:
> btrfs_assert_delayed_root_empty() will check if the delayed root is
> completely empty, but this is a fs wide check.  On cleanup we may have
> allowed other transactions to begin, for whatever reason, and thus the
> delayed root is not empty.  So remove this check from
> cleanup_one_transation().  This however can stay in
> btrfs_cleanup_transaction(), because it checks only after all of the
> transactions have been properly cleaned up, and thus is valid.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Nikolay Borisov <nborisov@suse.com>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/4] btrfs: set fs_root = NULL on error
  2020-02-11 21:40 ` [PATCH 1/4] btrfs: set fs_root = NULL on error Josef Bacik
  2020-02-12  0:55   ` Qu Wenruo
  2020-02-13 10:05   ` Johannes Thumshirn
@ 2020-02-13 10:48   ` Nikolay Borisov
  2020-02-13 15:31     ` Josef Bacik
  2 siblings, 1 reply; 20+ messages in thread
From: Nikolay Borisov @ 2020-02-13 10:48 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team



On 11.02.20 г. 23:40 ч., Josef Bacik wrote:
> While running my error injection script I hit a panic when we tried to
> clean up the fs_root when free'ing the fs_root.  This is because
> fs_info->fs_root == PTR_ERR(-EIO), which isn't great.  Fix this by
> setting fs_info->fs_root = NULL; if we fail to read the root.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>


While looking to see how ->fs_root (git grep "\->fs_root\W" fs/btrfs) is
used I realized we almost never query it through that member. It's
cleaned up via the btrfs_free_fs_roots which queries the root radix.
Given this I fail to see how the presence of a bogus value in
fs_info->fs_root would cause a crash (it's certainly wrong so your patch
per-se is fine). Can you provide an example call trace?


In any case :

Reviewed-by: Nikolay Borisov <nborisov@suse.com>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/4] btrfs: do not check delayed items are empty for single trans cleanup
  2020-02-11 21:40 ` [PATCH 2/4] btrfs: do not check delayed items are empty for single trans cleanup Josef Bacik
  2020-02-12  0:57   ` Qu Wenruo
  2020-02-13 10:28   ` Nikolay Borisov
@ 2020-02-13 11:15   ` Johannes Thumshirn
  2 siblings, 0 replies; 20+ messages in thread
From: Johannes Thumshirn @ 2020-02-13 11:15 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/4] btrfs: handle logged extent failure properly
  2020-02-11 21:40 ` [PATCH 3/4] btrfs: handle logged extent failure properly Josef Bacik
  2020-02-12  0:58   ` Qu Wenruo
  2020-02-13 10:26   ` Nikolay Borisov
@ 2020-02-13 11:16   ` Johannes Thumshirn
  2 siblings, 0 replies; 20+ messages in thread
From: Johannes Thumshirn @ 2020-02-13 11:16 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 4/4] btrfs: fix bytes_may_use underflow in prealloc error condtition
  2020-02-13 10:17   ` Nikolay Borisov
@ 2020-02-13 15:29     ` Josef Bacik
  0 siblings, 0 replies; 20+ messages in thread
From: Josef Bacik @ 2020-02-13 15:29 UTC (permalink / raw)
  To: Nikolay Borisov, linux-btrfs, kernel-team

On 2/13/20 5:17 AM, Nikolay Borisov wrote:
> 
> 
> On 11.02.20 г. 23:40 ч., Josef Bacik wrote:
>> I hit the following warning while running my error injection stress testing
>>
>> ------------[ cut here ]------------
>> WARNING: CPU: 3 PID: 1453 at fs/btrfs/space-info.h:108 btrfs_free_reserved_data_space_noquota+0xfd/0x160 [btrfs]
>> RIP: 0010:btrfs_free_reserved_data_space_noquota+0xfd/0x160 [btrfs]
>> Call Trace:
>> btrfs_free_reserved_data_space+0x4f/0x70 [btrfs]
>> __btrfs_prealloc_file_range+0x378/0x470 [btrfs]
>> elfcorehdr_read+0x40/0x40
>> ? elfcorehdr_read+0x40/0x40
>> ? btrfs_commit_transaction+0xca/0xa50 [btrfs]
>> ? dput+0xb4/0x2a0
>> ? btrfs_log_dentry_safe+0x55/0x70 [btrfs]
>> ? btrfs_sync_file+0x30e/0x420 [btrfs]
>> ? do_fsync+0x38/0x70
>> ? __x64_sys_fdatasync+0x13/0x20
>> ? do_syscall_64+0x5b/0x1b0
>> ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
>> ---[ end trace 70ccb5d0fe51151c ]---
>>
>> This happens if we fail to insert our reserved file extent.  At this
>> point we've already converted our reservation from ->bytes_may_use to
>> ->bytes_reserved.  However once we break we will attempt to free
>> everything from [cur_offset, end] from ->bytes_may_use, but our extent
>> reservation will overlap part of this.
>>
>> Fix this problem by adding ins.offset (our extent allocation size) to
>> cur_offset so we remove the actual remaining part from ->bytes_may_use.
> This contradicts the code, you are adding ins.objectid which is the
> offset and not the size. This means either the code is buggy.

Ooops you're right, I was getting lucky because we're making the whole 
allocation at once, and ins.objectid was past extent_end so we ended up doing 
the right thing, but for the wrong reasons.  In fact I need to adjust this for 
the other error condition, so I'll fix this up.  Thanks,

Josef

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/4] btrfs: set fs_root = NULL on error
  2020-02-13 10:48   ` Nikolay Borisov
@ 2020-02-13 15:31     ` Josef Bacik
  2020-02-13 15:32       ` Nikolay Borisov
  0 siblings, 1 reply; 20+ messages in thread
From: Josef Bacik @ 2020-02-13 15:31 UTC (permalink / raw)
  To: Nikolay Borisov, linux-btrfs, kernel-team

On 2/13/20 5:48 AM, Nikolay Borisov wrote:
> 
> 
> On 11.02.20 г. 23:40 ч., Josef Bacik wrote:
>> While running my error injection script I hit a panic when we tried to
>> clean up the fs_root when free'ing the fs_root.  This is because
>> fs_info->fs_root == PTR_ERR(-EIO), which isn't great.  Fix this by
>> setting fs_info->fs_root = NULL; if we fail to read the root.
>>
>> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> 
> 
> While looking to see how ->fs_root (git grep "\->fs_root\W" fs/btrfs) is
> used I realized we almost never query it through that member. It's
> cleaned up via the btrfs_free_fs_roots which queries the root radix.
> Given this I fail to see how the presence of a bogus value in
> fs_info->fs_root would cause a crash (it's certainly wrong so your patch
> per-se is fine). Can you provide an example call trace?
> 

We do a btrfs_put_root(fs_info->fs_root); in btrfs_free_fs_info.  There's for 
sure an argument to be made for getting rid of fs_info->fs_root, and just using 
the radix lookup.  Once all of my root ref patches are merged I'll take a run at 
that.  Thanks,

Josef

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/4] btrfs: set fs_root = NULL on error
  2020-02-13 15:31     ` Josef Bacik
@ 2020-02-13 15:32       ` Nikolay Borisov
  0 siblings, 0 replies; 20+ messages in thread
From: Nikolay Borisov @ 2020-02-13 15:32 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team



On 13.02.20 г. 17:31 ч., Josef Bacik wrote:
> 
> We do a btrfs_put_root(fs_info->fs_root); in btrfs_free_fs_info. 
> There's for sure an argument to be made for getting rid of
> fs_info->fs_root, and just using the radix lookup.  Once all of my root
> ref patches are merged I'll take a run at that.  Thanks,


This makes sense, I was looking at linux master which don't have your
patches, hence I wasn't seeing this. Fair point.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 4/4] btrfs: fix bytes_may_use underflow in prealloc error condtition
  2020-02-13 15:47 [PATCH 0/4][v2] Error condition failure fixes Josef Bacik
@ 2020-02-13 15:47 ` Josef Bacik
  0 siblings, 0 replies; 20+ messages in thread
From: Josef Bacik @ 2020-02-13 15:47 UTC (permalink / raw)
  To: linux-btrfs, kernel-team; +Cc: Qu Wenruo

I hit the following warning while running my error injection stress testing

------------[ cut here ]------------
WARNING: CPU: 3 PID: 1453 at fs/btrfs/space-info.h:108 btrfs_free_reserved_data_space_noquota+0xfd/0x160 [btrfs]
RIP: 0010:btrfs_free_reserved_data_space_noquota+0xfd/0x160 [btrfs]
Call Trace:
btrfs_free_reserved_data_space+0x4f/0x70 [btrfs]
__btrfs_prealloc_file_range+0x378/0x470 [btrfs]
elfcorehdr_read+0x40/0x40
? elfcorehdr_read+0x40/0x40
? btrfs_commit_transaction+0xca/0xa50 [btrfs]
? dput+0xb4/0x2a0
? btrfs_log_dentry_safe+0x55/0x70 [btrfs]
? btrfs_sync_file+0x30e/0x420 [btrfs]
? do_fsync+0x38/0x70
? __x64_sys_fdatasync+0x13/0x20
? do_syscall_64+0x5b/0x1b0
? entry_SYSCALL_64_after_hwframe+0x44/0xa9
---[ end trace 70ccb5d0fe51151c ]---

This happens if we fail to insert our reserved file extent.  At this
point we've already converted our reservation from ->bytes_may_use to
->bytes_reserved.  However once we break we will attempt to free
everything from [cur_offset, end] from ->bytes_may_use, but our extent
reservation will overlap part of this.

Fix this problem by adding ins.offset (our extent allocation size) to
cur_offset so we remove the actual remaining part from ->bytes_may_use.

I validated this fix using my inject-error.py script

python inject-error.py -o should_fail_bio -t cache_save_setup -t \
	__btrfs_prealloc_file_range \
	-t insert_reserved_file_extent.constprop.0 \
	-r "-5" ./run-fsstress.sh

where run-fsstress.sh simply mounts and runs fsstress on a disk.

Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/inode.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 84e649724549..ffc6fcfe805c 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -9876,6 +9876,7 @@ static int __btrfs_prealloc_file_range(struct inode *inode, int mode,
 	struct btrfs_root *root = BTRFS_I(inode)->root;
 	struct btrfs_key ins;
 	u64 cur_offset = start;
+	u64 clear_offset = start;
 	u64 i_size;
 	u64 cur_bytes;
 	u64 last_alloc = (u64)-1;
@@ -9910,6 +9911,15 @@ static int __btrfs_prealloc_file_range(struct inode *inode, int mode,
 				btrfs_end_transaction(trans);
 			break;
 		}
+
+		/*
+		 * We've reserved this space, and thus converted it from
+		 * ->bytes_may_use to ->bytes_reserved.  Any error that happens
+		 * from here on out we will only need to clear our reservation
+		 * for the remaining unreserved area, so advance our
+		 * clear_offset by our extent size.
+		 */
+		clear_offset += ins.offset;
 		btrfs_dec_block_group_reservations(fs_info, ins.objectid);
 
 		last_alloc = ins.offset;
@@ -9989,9 +9999,9 @@ static int __btrfs_prealloc_file_range(struct inode *inode, int mode,
 		if (own_trans)
 			btrfs_end_transaction(trans);
 	}
-	if (cur_offset < end)
-		btrfs_free_reserved_data_space(inode, NULL, cur_offset,
-			end - cur_offset + 1);
+	if (clear_offset < end)
+		btrfs_free_reserved_data_space(inode, NULL, clear_offset,
+			end - clear_offset + 1);
 	return ret;
 }
 
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2020-02-13 15:47 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-11 21:40 [PATCH 0/4] Error condition failure fixes Josef Bacik
2020-02-11 21:40 ` [PATCH 1/4] btrfs: set fs_root = NULL on error Josef Bacik
2020-02-12  0:55   ` Qu Wenruo
2020-02-13 10:05   ` Johannes Thumshirn
2020-02-13 10:48   ` Nikolay Borisov
2020-02-13 15:31     ` Josef Bacik
2020-02-13 15:32       ` Nikolay Borisov
2020-02-11 21:40 ` [PATCH 2/4] btrfs: do not check delayed items are empty for single trans cleanup Josef Bacik
2020-02-12  0:57   ` Qu Wenruo
2020-02-13 10:28   ` Nikolay Borisov
2020-02-13 11:15   ` Johannes Thumshirn
2020-02-11 21:40 ` [PATCH 3/4] btrfs: handle logged extent failure properly Josef Bacik
2020-02-12  0:58   ` Qu Wenruo
2020-02-13 10:26   ` Nikolay Borisov
2020-02-13 11:16   ` Johannes Thumshirn
2020-02-11 21:40 ` [PATCH 4/4] btrfs: fix bytes_may_use underflow in prealloc error condtition Josef Bacik
2020-02-12  1:00   ` Qu Wenruo
2020-02-13 10:17   ` Nikolay Borisov
2020-02-13 15:29     ` Josef Bacik
2020-02-13 15:47 [PATCH 0/4][v2] Error condition failure fixes Josef Bacik
2020-02-13 15:47 ` [PATCH 4/4] btrfs: fix bytes_may_use underflow in prealloc error condtition Josef Bacik

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).