* [PATCH 0/4] Error condition failure fixes
@ 2020-02-11 21:40 Josef Bacik
2020-02-11 21:40 ` [PATCH 1/4] btrfs: set fs_root = NULL on error Josef Bacik
` (3 more replies)
0 siblings, 4 replies; 20+ messages in thread
From: Josef Bacik @ 2020-02-11 21:40 UTC (permalink / raw)
To: linux-btrfs, kernel-team
I've been running my bpf error injection stress script and been fixing what has
fallen out. I don't think I've fixed everything yet, but to reduce the noise
from Dave's testing here's the current set of fixes I have. These are based on
misc-next from late last week, but should apply cleanly to the recent set.
These aren't super important, but will cut down on the noise from things like
generic/019 and generic/475. Thanks,
Josef
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH 1/4] btrfs: set fs_root = NULL on error
2020-02-11 21:40 [PATCH 0/4] Error condition failure fixes Josef Bacik
@ 2020-02-11 21:40 ` Josef Bacik
2020-02-12 0:55 ` Qu Wenruo
` (2 more replies)
2020-02-11 21:40 ` [PATCH 2/4] btrfs: do not check delayed items are empty for single trans cleanup Josef Bacik
` (2 subsequent siblings)
3 siblings, 3 replies; 20+ messages in thread
From: Josef Bacik @ 2020-02-11 21:40 UTC (permalink / raw)
To: linux-btrfs, kernel-team
While running my error injection script I hit a panic when we tried to
clean up the fs_root when free'ing the fs_root. This is because
fs_info->fs_root == PTR_ERR(-EIO), which isn't great. Fix this by
setting fs_info->fs_root = NULL; if we fail to read the root.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
fs/btrfs/disk-io.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index eb441fa3711b..5b6140482cef 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -3260,6 +3260,7 @@ int __cold open_ctree(struct super_block *sb,
if (IS_ERR(fs_info->fs_root)) {
err = PTR_ERR(fs_info->fs_root);
btrfs_warn(fs_info, "failed to read fs tree: %d", err);
+ fs_info->fs_root = NULL;
goto fail_qgroup;
}
--
2.24.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH 2/4] btrfs: do not check delayed items are empty for single trans cleanup
2020-02-11 21:40 [PATCH 0/4] Error condition failure fixes Josef Bacik
2020-02-11 21:40 ` [PATCH 1/4] btrfs: set fs_root = NULL on error Josef Bacik
@ 2020-02-11 21:40 ` Josef Bacik
2020-02-12 0:57 ` Qu Wenruo
` (2 more replies)
2020-02-11 21:40 ` [PATCH 3/4] btrfs: handle logged extent failure properly Josef Bacik
2020-02-11 21:40 ` [PATCH 4/4] btrfs: fix bytes_may_use underflow in prealloc error condtition Josef Bacik
3 siblings, 3 replies; 20+ messages in thread
From: Josef Bacik @ 2020-02-11 21:40 UTC (permalink / raw)
To: linux-btrfs, kernel-team
btrfs_assert_delayed_root_empty() will check if the delayed root is
completely empty, but this is a fs wide check. On cleanup we may have
allowed other transactions to begin, for whatever reason, and thus the
delayed root is not empty. So remove this check from
cleanup_one_transation(). This however can stay in
btrfs_cleanup_transaction(), because it checks only after all of the
transactions have been properly cleaned up, and thus is valid.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
fs/btrfs/disk-io.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 5b6140482cef..601ed3335cf6 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -4543,7 +4543,6 @@ void btrfs_cleanup_one_transaction(struct btrfs_transaction *cur_trans,
wake_up(&fs_info->transaction_wait);
btrfs_destroy_delayed_inodes(fs_info);
- btrfs_assert_delayed_root_empty(fs_info);
btrfs_destroy_marked_extents(fs_info, &cur_trans->dirty_pages,
EXTENT_DIRTY);
--
2.24.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH 3/4] btrfs: handle logged extent failure properly
2020-02-11 21:40 [PATCH 0/4] Error condition failure fixes Josef Bacik
2020-02-11 21:40 ` [PATCH 1/4] btrfs: set fs_root = NULL on error Josef Bacik
2020-02-11 21:40 ` [PATCH 2/4] btrfs: do not check delayed items are empty for single trans cleanup Josef Bacik
@ 2020-02-11 21:40 ` Josef Bacik
2020-02-12 0:58 ` Qu Wenruo
` (2 more replies)
2020-02-11 21:40 ` [PATCH 4/4] btrfs: fix bytes_may_use underflow in prealloc error condtition Josef Bacik
3 siblings, 3 replies; 20+ messages in thread
From: Josef Bacik @ 2020-02-11 21:40 UTC (permalink / raw)
To: linux-btrfs, kernel-team
If we're allocating a logged extent we attempt to insert an extent
record for the file extent directly. We increase
space_info->bytes_reserved, because the extent entry addition will call
btrfs_update_block_group(), which will convert the ->bytes_reserved to
->bytes_used. However if we fail at any point while inserting the
extent entry we will bail and leave space on ->bytes_reserved, which
will trigger a WARN_ON() on umount. Fix this by pinning the space if we
fail to insert, which is what happens in every other failure case that
involves adding the extent entry.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
fs/btrfs/extent-tree.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index c43acb329fa6..2b4c3ca5e651 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -4430,6 +4430,8 @@ int btrfs_alloc_logged_file_extent(struct btrfs_trans_handle *trans,
ret = alloc_reserved_file_extent(trans, 0, root_objectid, 0, owner,
offset, ins, 1);
+ if (ret)
+ btrfs_pin_extent(fs_info, ins->objectid, ins->offset, 1);
btrfs_put_block_group(block_group);
return ret;
}
--
2.24.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH 4/4] btrfs: fix bytes_may_use underflow in prealloc error condtition
2020-02-11 21:40 [PATCH 0/4] Error condition failure fixes Josef Bacik
` (2 preceding siblings ...)
2020-02-11 21:40 ` [PATCH 3/4] btrfs: handle logged extent failure properly Josef Bacik
@ 2020-02-11 21:40 ` Josef Bacik
2020-02-12 1:00 ` Qu Wenruo
2020-02-13 10:17 ` Nikolay Borisov
3 siblings, 2 replies; 20+ messages in thread
From: Josef Bacik @ 2020-02-11 21:40 UTC (permalink / raw)
To: linux-btrfs, kernel-team
I hit the following warning while running my error injection stress testing
------------[ cut here ]------------
WARNING: CPU: 3 PID: 1453 at fs/btrfs/space-info.h:108 btrfs_free_reserved_data_space_noquota+0xfd/0x160 [btrfs]
RIP: 0010:btrfs_free_reserved_data_space_noquota+0xfd/0x160 [btrfs]
Call Trace:
btrfs_free_reserved_data_space+0x4f/0x70 [btrfs]
__btrfs_prealloc_file_range+0x378/0x470 [btrfs]
elfcorehdr_read+0x40/0x40
? elfcorehdr_read+0x40/0x40
? btrfs_commit_transaction+0xca/0xa50 [btrfs]
? dput+0xb4/0x2a0
? btrfs_log_dentry_safe+0x55/0x70 [btrfs]
? btrfs_sync_file+0x30e/0x420 [btrfs]
? do_fsync+0x38/0x70
? __x64_sys_fdatasync+0x13/0x20
? do_syscall_64+0x5b/0x1b0
? entry_SYSCALL_64_after_hwframe+0x44/0xa9
---[ end trace 70ccb5d0fe51151c ]---
This happens if we fail to insert our reserved file extent. At this
point we've already converted our reservation from ->bytes_may_use to
->bytes_reserved. However once we break we will attempt to free
everything from [cur_offset, end] from ->bytes_may_use, but our extent
reservation will overlap part of this.
Fix this problem by adding ins.offset (our extent allocation size) to
cur_offset so we remove the actual remaining part from ->bytes_may_use.
I validated this fix using my inject-error.py script
python inject-error.py -o should_fail_bio -t cache_save_setup -t \
__btrfs_prealloc_file_range \
-t insert_reserved_file_extent.constprop.0 \
-r "-5" ./run-fsstress.sh
where run-fsstress.sh simply mounts and runs fsstress on a disk.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
fs/btrfs/inode.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 84e649724549..747d860aedf6 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -9919,6 +9919,14 @@ static int __btrfs_prealloc_file_range(struct inode *inode, int mode,
ins.offset, 0, 0, 0,
BTRFS_FILE_EXTENT_PREALLOC);
if (ret) {
+ /*
+ * We've reserved this space, and thus converted it from
+ * ->bytes_may_use to ->bytes_reserved, which we cleanup
+ * here. We need to adjust cur_offset so that we only
+ * drop the ->bytes_may_use for the area we still have
+ * remaining in ->>bytes_may_use.
+ */
+ cur_offset += ins.objectid;
btrfs_free_reserved_extent(fs_info, ins.objectid,
ins.offset, 0);
btrfs_abort_transaction(trans, ret);
--
2.24.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH 1/4] btrfs: set fs_root = NULL on error
2020-02-11 21:40 ` [PATCH 1/4] btrfs: set fs_root = NULL on error Josef Bacik
@ 2020-02-12 0:55 ` Qu Wenruo
2020-02-13 10:05 ` Johannes Thumshirn
2020-02-13 10:48 ` Nikolay Borisov
2 siblings, 0 replies; 20+ messages in thread
From: Qu Wenruo @ 2020-02-12 0:55 UTC (permalink / raw)
To: Josef Bacik, linux-btrfs, kernel-team
[-- Attachment #1.1: Type: text/plain, Size: 1228 bytes --]
On 2020/2/12 上午5:40, Josef Bacik wrote:
> While running my error injection script I hit a panic when we tried to
> clean up the fs_root when free'ing the fs_root. This is because
> fs_info->fs_root == PTR_ERR(-EIO), which isn't great. Fix this by
> setting fs_info->fs_root = NULL; if we fail to read the root.
>
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Just one off-topic idea, can we have test cases in fstests to do
specific error injection test?
For your fix, we can inject ENOMEM error with call chain
btrfs_read_fs_root_no_name()->open_ctree() to get a 100% reproducible
test, which looks to be a solid test case.
Thanks,
Qu
> ---
> fs/btrfs/disk-io.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index eb441fa3711b..5b6140482cef 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -3260,6 +3260,7 @@ int __cold open_ctree(struct super_block *sb,
> if (IS_ERR(fs_info->fs_root)) {
> err = PTR_ERR(fs_info->fs_root);
> btrfs_warn(fs_info, "failed to read fs tree: %d", err);
> + fs_info->fs_root = NULL;
> goto fail_qgroup;
> }
>
>
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/4] btrfs: do not check delayed items are empty for single trans cleanup
2020-02-11 21:40 ` [PATCH 2/4] btrfs: do not check delayed items are empty for single trans cleanup Josef Bacik
@ 2020-02-12 0:57 ` Qu Wenruo
2020-02-13 10:28 ` Nikolay Borisov
2020-02-13 11:15 ` Johannes Thumshirn
2 siblings, 0 replies; 20+ messages in thread
From: Qu Wenruo @ 2020-02-12 0:57 UTC (permalink / raw)
To: Josef Bacik, linux-btrfs, kernel-team
[-- Attachment #1.1: Type: text/plain, Size: 1401 bytes --]
On 2020/2/12 上午5:40, Josef Bacik wrote:
> btrfs_assert_delayed_root_empty() will check if the delayed root is
> completely empty, but this is a fs wide check. On cleanup we may have
> allowed other transactions to begin, for whatever reason, and thus the
> delayed root is not empty. So remove this check from
> cleanup_one_transation(). This however can stay in
> btrfs_cleanup_transaction(), because it checks only after all of the
> transactions have been properly cleaned up, and thus is valid.
>
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Just a nitpick, to allow other user to verify the fix, would you mind to
provide a specific reproducer?
Like the error injection (I guess it's still memory allocation failure),
the call chain.
Thanks,
Qu
> ---
> fs/btrfs/disk-io.c | 1 -
> 1 file changed, 1 deletion(-)
>
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index 5b6140482cef..601ed3335cf6 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -4543,7 +4543,6 @@ void btrfs_cleanup_one_transaction(struct btrfs_transaction *cur_trans,
> wake_up(&fs_info->transaction_wait);
>
> btrfs_destroy_delayed_inodes(fs_info);
> - btrfs_assert_delayed_root_empty(fs_info);
>
> btrfs_destroy_marked_extents(fs_info, &cur_trans->dirty_pages,
> EXTENT_DIRTY);
>
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 3/4] btrfs: handle logged extent failure properly
2020-02-11 21:40 ` [PATCH 3/4] btrfs: handle logged extent failure properly Josef Bacik
@ 2020-02-12 0:58 ` Qu Wenruo
2020-02-13 10:26 ` Nikolay Borisov
2020-02-13 11:16 ` Johannes Thumshirn
2 siblings, 0 replies; 20+ messages in thread
From: Qu Wenruo @ 2020-02-12 0:58 UTC (permalink / raw)
To: Josef Bacik, linux-btrfs, kernel-team
[-- Attachment #1.1: Type: text/plain, Size: 1364 bytes --]
On 2020/2/12 上午5:40, Josef Bacik wrote:
> If we're allocating a logged extent we attempt to insert an extent
> record for the file extent directly. We increase
> space_info->bytes_reserved, because the extent entry addition will call
> btrfs_update_block_group(), which will convert the ->bytes_reserved to
> ->bytes_used. However if we fail at any point while inserting the
> extent entry we will bail and leave space on ->bytes_reserved, which
> will trigger a WARN_ON() on umount. Fix this by pinning the space if we
> fail to insert, which is what happens in every other failure case that
> involves adding the extent entry.
>
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Thanks,
Qu
> ---
> fs/btrfs/extent-tree.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index c43acb329fa6..2b4c3ca5e651 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -4430,6 +4430,8 @@ int btrfs_alloc_logged_file_extent(struct btrfs_trans_handle *trans,
>
> ret = alloc_reserved_file_extent(trans, 0, root_objectid, 0, owner,
> offset, ins, 1);
> + if (ret)
> + btrfs_pin_extent(fs_info, ins->objectid, ins->offset, 1);
> btrfs_put_block_group(block_group);
> return ret;
> }
>
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 4/4] btrfs: fix bytes_may_use underflow in prealloc error condtition
2020-02-11 21:40 ` [PATCH 4/4] btrfs: fix bytes_may_use underflow in prealloc error condtition Josef Bacik
@ 2020-02-12 1:00 ` Qu Wenruo
2020-02-13 10:17 ` Nikolay Borisov
1 sibling, 0 replies; 20+ messages in thread
From: Qu Wenruo @ 2020-02-12 1:00 UTC (permalink / raw)
To: Josef Bacik, linux-btrfs, kernel-team
[-- Attachment #1.1: Type: text/plain, Size: 2687 bytes --]
On 2020/2/12 上午5:40, Josef Bacik wrote:
> I hit the following warning while running my error injection stress testing
>
> ------------[ cut here ]------------
> WARNING: CPU: 3 PID: 1453 at fs/btrfs/space-info.h:108 btrfs_free_reserved_data_space_noquota+0xfd/0x160 [btrfs]
> RIP: 0010:btrfs_free_reserved_data_space_noquota+0xfd/0x160 [btrfs]
> Call Trace:
> btrfs_free_reserved_data_space+0x4f/0x70 [btrfs]
> __btrfs_prealloc_file_range+0x378/0x470 [btrfs]
> elfcorehdr_read+0x40/0x40
> ? elfcorehdr_read+0x40/0x40
> ? btrfs_commit_transaction+0xca/0xa50 [btrfs]
> ? dput+0xb4/0x2a0
> ? btrfs_log_dentry_safe+0x55/0x70 [btrfs]
> ? btrfs_sync_file+0x30e/0x420 [btrfs]
> ? do_fsync+0x38/0x70
> ? __x64_sys_fdatasync+0x13/0x20
> ? do_syscall_64+0x5b/0x1b0
> ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
> ---[ end trace 70ccb5d0fe51151c ]---
>
> This happens if we fail to insert our reserved file extent. At this
> point we've already converted our reservation from ->bytes_may_use to
> ->bytes_reserved. However once we break we will attempt to free
> everything from [cur_offset, end] from ->bytes_may_use, but our extent
> reservation will overlap part of this.
>
> Fix this problem by adding ins.offset (our extent allocation size) to
> cur_offset so we remove the actual remaining part from ->bytes_may_use.
>
> I validated this fix using my inject-error.py script
>
> python inject-error.py -o should_fail_bio -t cache_save_setup -t \
> __btrfs_prealloc_file_range \
> -t insert_reserved_file_extent.constprop.0 \
> -r "-5" ./run-fsstress.sh
>
> where run-fsstress.sh simply mounts and runs fsstress on a disk.
>
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Thanks,
Qu
> ---
> fs/btrfs/inode.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index 84e649724549..747d860aedf6 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -9919,6 +9919,14 @@ static int __btrfs_prealloc_file_range(struct inode *inode, int mode,
> ins.offset, 0, 0, 0,
> BTRFS_FILE_EXTENT_PREALLOC);
> if (ret) {
> + /*
> + * We've reserved this space, and thus converted it from
> + * ->bytes_may_use to ->bytes_reserved, which we cleanup
> + * here. We need to adjust cur_offset so that we only
> + * drop the ->bytes_may_use for the area we still have
> + * remaining in ->>bytes_may_use.
> + */
> + cur_offset += ins.objectid;
> btrfs_free_reserved_extent(fs_info, ins.objectid,
> ins.offset, 0);
> btrfs_abort_transaction(trans, ret);
>
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 1/4] btrfs: set fs_root = NULL on error
2020-02-11 21:40 ` [PATCH 1/4] btrfs: set fs_root = NULL on error Josef Bacik
2020-02-12 0:55 ` Qu Wenruo
@ 2020-02-13 10:05 ` Johannes Thumshirn
2020-02-13 10:48 ` Nikolay Borisov
2 siblings, 0 replies; 20+ messages in thread
From: Johannes Thumshirn @ 2020-02-13 10:05 UTC (permalink / raw)
To: Josef Bacik, linux-btrfs, kernel-team
Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 4/4] btrfs: fix bytes_may_use underflow in prealloc error condtition
2020-02-11 21:40 ` [PATCH 4/4] btrfs: fix bytes_may_use underflow in prealloc error condtition Josef Bacik
2020-02-12 1:00 ` Qu Wenruo
@ 2020-02-13 10:17 ` Nikolay Borisov
2020-02-13 15:29 ` Josef Bacik
1 sibling, 1 reply; 20+ messages in thread
From: Nikolay Borisov @ 2020-02-13 10:17 UTC (permalink / raw)
To: Josef Bacik, linux-btrfs, kernel-team
On 11.02.20 г. 23:40 ч., Josef Bacik wrote:
> I hit the following warning while running my error injection stress testing
>
> ------------[ cut here ]------------
> WARNING: CPU: 3 PID: 1453 at fs/btrfs/space-info.h:108 btrfs_free_reserved_data_space_noquota+0xfd/0x160 [btrfs]
> RIP: 0010:btrfs_free_reserved_data_space_noquota+0xfd/0x160 [btrfs]
> Call Trace:
> btrfs_free_reserved_data_space+0x4f/0x70 [btrfs]
> __btrfs_prealloc_file_range+0x378/0x470 [btrfs]
> elfcorehdr_read+0x40/0x40
> ? elfcorehdr_read+0x40/0x40
> ? btrfs_commit_transaction+0xca/0xa50 [btrfs]
> ? dput+0xb4/0x2a0
> ? btrfs_log_dentry_safe+0x55/0x70 [btrfs]
> ? btrfs_sync_file+0x30e/0x420 [btrfs]
> ? do_fsync+0x38/0x70
> ? __x64_sys_fdatasync+0x13/0x20
> ? do_syscall_64+0x5b/0x1b0
> ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
> ---[ end trace 70ccb5d0fe51151c ]---
>
> This happens if we fail to insert our reserved file extent. At this
> point we've already converted our reservation from ->bytes_may_use to
> ->bytes_reserved. However once we break we will attempt to free
> everything from [cur_offset, end] from ->bytes_may_use, but our extent
> reservation will overlap part of this.
>
> Fix this problem by adding ins.offset (our extent allocation size) to
> cur_offset so we remove the actual remaining part from ->bytes_may_use.
This contradicts the code, you are adding ins.objectid which is the
offset and not the size. This means either the code is buggy.
<snip>
> if (ret) {
> + /*
> + * We've reserved this space, and thus converted it from
> + * ->bytes_may_use to ->bytes_reserved, which we cleanup
> + * here. We need to adjust cur_offset so that we only
> + * drop the ->bytes_may_use for the area we still have
> + * remaining in ->>bytes_may_use.
> + */
> + cur_offset += ins.objectid;
> btrfs_free_reserved_extent(fs_info, ins.objectid,
> ins.offset, 0);
> btrfs_abort_transaction(trans, ret);
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 3/4] btrfs: handle logged extent failure properly
2020-02-11 21:40 ` [PATCH 3/4] btrfs: handle logged extent failure properly Josef Bacik
2020-02-12 0:58 ` Qu Wenruo
@ 2020-02-13 10:26 ` Nikolay Borisov
2020-02-13 11:16 ` Johannes Thumshirn
2 siblings, 0 replies; 20+ messages in thread
From: Nikolay Borisov @ 2020-02-13 10:26 UTC (permalink / raw)
To: Josef Bacik, linux-btrfs, kernel-team
On 11.02.20 г. 23:40 ч., Josef Bacik wrote:
> If we're allocating a logged extent we attempt to insert an extent
> record for the file extent directly. We increase
> space_info->bytes_reserved, because the extent entry addition will call
> btrfs_update_block_group(), which will convert the ->bytes_reserved to
> ->bytes_used. However if we fail at any point while inserting the
> extent entry we will bail and leave space on ->bytes_reserved, which
> will trigger a WARN_ON() on umount. Fix this by pinning the space if we
> fail to insert, which is what happens in every other failure case that
> involves adding the extent entry.
>
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/4] btrfs: do not check delayed items are empty for single trans cleanup
2020-02-11 21:40 ` [PATCH 2/4] btrfs: do not check delayed items are empty for single trans cleanup Josef Bacik
2020-02-12 0:57 ` Qu Wenruo
@ 2020-02-13 10:28 ` Nikolay Borisov
2020-02-13 11:15 ` Johannes Thumshirn
2 siblings, 0 replies; 20+ messages in thread
From: Nikolay Borisov @ 2020-02-13 10:28 UTC (permalink / raw)
To: Josef Bacik, linux-btrfs, kernel-team
On 11.02.20 г. 23:40 ч., Josef Bacik wrote:
> btrfs_assert_delayed_root_empty() will check if the delayed root is
> completely empty, but this is a fs wide check. On cleanup we may have
> allowed other transactions to begin, for whatever reason, and thus the
> delayed root is not empty. So remove this check from
> cleanup_one_transation(). This however can stay in
> btrfs_cleanup_transaction(), because it checks only after all of the
> transactions have been properly cleaned up, and thus is valid.
>
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 1/4] btrfs: set fs_root = NULL on error
2020-02-11 21:40 ` [PATCH 1/4] btrfs: set fs_root = NULL on error Josef Bacik
2020-02-12 0:55 ` Qu Wenruo
2020-02-13 10:05 ` Johannes Thumshirn
@ 2020-02-13 10:48 ` Nikolay Borisov
2020-02-13 15:31 ` Josef Bacik
2 siblings, 1 reply; 20+ messages in thread
From: Nikolay Borisov @ 2020-02-13 10:48 UTC (permalink / raw)
To: Josef Bacik, linux-btrfs, kernel-team
On 11.02.20 г. 23:40 ч., Josef Bacik wrote:
> While running my error injection script I hit a panic when we tried to
> clean up the fs_root when free'ing the fs_root. This is because
> fs_info->fs_root == PTR_ERR(-EIO), which isn't great. Fix this by
> setting fs_info->fs_root = NULL; if we fail to read the root.
>
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
While looking to see how ->fs_root (git grep "\->fs_root\W" fs/btrfs) is
used I realized we almost never query it through that member. It's
cleaned up via the btrfs_free_fs_roots which queries the root radix.
Given this I fail to see how the presence of a bogus value in
fs_info->fs_root would cause a crash (it's certainly wrong so your patch
per-se is fine). Can you provide an example call trace?
In any case :
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/4] btrfs: do not check delayed items are empty for single trans cleanup
2020-02-11 21:40 ` [PATCH 2/4] btrfs: do not check delayed items are empty for single trans cleanup Josef Bacik
2020-02-12 0:57 ` Qu Wenruo
2020-02-13 10:28 ` Nikolay Borisov
@ 2020-02-13 11:15 ` Johannes Thumshirn
2 siblings, 0 replies; 20+ messages in thread
From: Johannes Thumshirn @ 2020-02-13 11:15 UTC (permalink / raw)
To: Josef Bacik, linux-btrfs, kernel-team
Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 3/4] btrfs: handle logged extent failure properly
2020-02-11 21:40 ` [PATCH 3/4] btrfs: handle logged extent failure properly Josef Bacik
2020-02-12 0:58 ` Qu Wenruo
2020-02-13 10:26 ` Nikolay Borisov
@ 2020-02-13 11:16 ` Johannes Thumshirn
2 siblings, 0 replies; 20+ messages in thread
From: Johannes Thumshirn @ 2020-02-13 11:16 UTC (permalink / raw)
To: Josef Bacik, linux-btrfs, kernel-team
Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 4/4] btrfs: fix bytes_may_use underflow in prealloc error condtition
2020-02-13 10:17 ` Nikolay Borisov
@ 2020-02-13 15:29 ` Josef Bacik
0 siblings, 0 replies; 20+ messages in thread
From: Josef Bacik @ 2020-02-13 15:29 UTC (permalink / raw)
To: Nikolay Borisov, linux-btrfs, kernel-team
On 2/13/20 5:17 AM, Nikolay Borisov wrote:
>
>
> On 11.02.20 г. 23:40 ч., Josef Bacik wrote:
>> I hit the following warning while running my error injection stress testing
>>
>> ------------[ cut here ]------------
>> WARNING: CPU: 3 PID: 1453 at fs/btrfs/space-info.h:108 btrfs_free_reserved_data_space_noquota+0xfd/0x160 [btrfs]
>> RIP: 0010:btrfs_free_reserved_data_space_noquota+0xfd/0x160 [btrfs]
>> Call Trace:
>> btrfs_free_reserved_data_space+0x4f/0x70 [btrfs]
>> __btrfs_prealloc_file_range+0x378/0x470 [btrfs]
>> elfcorehdr_read+0x40/0x40
>> ? elfcorehdr_read+0x40/0x40
>> ? btrfs_commit_transaction+0xca/0xa50 [btrfs]
>> ? dput+0xb4/0x2a0
>> ? btrfs_log_dentry_safe+0x55/0x70 [btrfs]
>> ? btrfs_sync_file+0x30e/0x420 [btrfs]
>> ? do_fsync+0x38/0x70
>> ? __x64_sys_fdatasync+0x13/0x20
>> ? do_syscall_64+0x5b/0x1b0
>> ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
>> ---[ end trace 70ccb5d0fe51151c ]---
>>
>> This happens if we fail to insert our reserved file extent. At this
>> point we've already converted our reservation from ->bytes_may_use to
>> ->bytes_reserved. However once we break we will attempt to free
>> everything from [cur_offset, end] from ->bytes_may_use, but our extent
>> reservation will overlap part of this.
>>
>> Fix this problem by adding ins.offset (our extent allocation size) to
>> cur_offset so we remove the actual remaining part from ->bytes_may_use.
> This contradicts the code, you are adding ins.objectid which is the
> offset and not the size. This means either the code is buggy.
Ooops you're right, I was getting lucky because we're making the whole
allocation at once, and ins.objectid was past extent_end so we ended up doing
the right thing, but for the wrong reasons. In fact I need to adjust this for
the other error condition, so I'll fix this up. Thanks,
Josef
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 1/4] btrfs: set fs_root = NULL on error
2020-02-13 10:48 ` Nikolay Borisov
@ 2020-02-13 15:31 ` Josef Bacik
2020-02-13 15:32 ` Nikolay Borisov
0 siblings, 1 reply; 20+ messages in thread
From: Josef Bacik @ 2020-02-13 15:31 UTC (permalink / raw)
To: Nikolay Borisov, linux-btrfs, kernel-team
On 2/13/20 5:48 AM, Nikolay Borisov wrote:
>
>
> On 11.02.20 г. 23:40 ч., Josef Bacik wrote:
>> While running my error injection script I hit a panic when we tried to
>> clean up the fs_root when free'ing the fs_root. This is because
>> fs_info->fs_root == PTR_ERR(-EIO), which isn't great. Fix this by
>> setting fs_info->fs_root = NULL; if we fail to read the root.
>>
>> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
>
>
> While looking to see how ->fs_root (git grep "\->fs_root\W" fs/btrfs) is
> used I realized we almost never query it through that member. It's
> cleaned up via the btrfs_free_fs_roots which queries the root radix.
> Given this I fail to see how the presence of a bogus value in
> fs_info->fs_root would cause a crash (it's certainly wrong so your patch
> per-se is fine). Can you provide an example call trace?
>
We do a btrfs_put_root(fs_info->fs_root); in btrfs_free_fs_info. There's for
sure an argument to be made for getting rid of fs_info->fs_root, and just using
the radix lookup. Once all of my root ref patches are merged I'll take a run at
that. Thanks,
Josef
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 1/4] btrfs: set fs_root = NULL on error
2020-02-13 15:31 ` Josef Bacik
@ 2020-02-13 15:32 ` Nikolay Borisov
0 siblings, 0 replies; 20+ messages in thread
From: Nikolay Borisov @ 2020-02-13 15:32 UTC (permalink / raw)
To: Josef Bacik, linux-btrfs, kernel-team
On 13.02.20 г. 17:31 ч., Josef Bacik wrote:
>
> We do a btrfs_put_root(fs_info->fs_root); in btrfs_free_fs_info.
> There's for sure an argument to be made for getting rid of
> fs_info->fs_root, and just using the radix lookup. Once all of my root
> ref patches are merged I'll take a run at that. Thanks,
This makes sense, I was looking at linux master which don't have your
patches, hence I wasn't seeing this. Fair point.
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH 4/4] btrfs: fix bytes_may_use underflow in prealloc error condtition
2020-02-13 15:47 [PATCH 0/4][v2] Error condition failure fixes Josef Bacik
@ 2020-02-13 15:47 ` Josef Bacik
0 siblings, 0 replies; 20+ messages in thread
From: Josef Bacik @ 2020-02-13 15:47 UTC (permalink / raw)
To: linux-btrfs, kernel-team; +Cc: Qu Wenruo
I hit the following warning while running my error injection stress testing
------------[ cut here ]------------
WARNING: CPU: 3 PID: 1453 at fs/btrfs/space-info.h:108 btrfs_free_reserved_data_space_noquota+0xfd/0x160 [btrfs]
RIP: 0010:btrfs_free_reserved_data_space_noquota+0xfd/0x160 [btrfs]
Call Trace:
btrfs_free_reserved_data_space+0x4f/0x70 [btrfs]
__btrfs_prealloc_file_range+0x378/0x470 [btrfs]
elfcorehdr_read+0x40/0x40
? elfcorehdr_read+0x40/0x40
? btrfs_commit_transaction+0xca/0xa50 [btrfs]
? dput+0xb4/0x2a0
? btrfs_log_dentry_safe+0x55/0x70 [btrfs]
? btrfs_sync_file+0x30e/0x420 [btrfs]
? do_fsync+0x38/0x70
? __x64_sys_fdatasync+0x13/0x20
? do_syscall_64+0x5b/0x1b0
? entry_SYSCALL_64_after_hwframe+0x44/0xa9
---[ end trace 70ccb5d0fe51151c ]---
This happens if we fail to insert our reserved file extent. At this
point we've already converted our reservation from ->bytes_may_use to
->bytes_reserved. However once we break we will attempt to free
everything from [cur_offset, end] from ->bytes_may_use, but our extent
reservation will overlap part of this.
Fix this problem by adding ins.offset (our extent allocation size) to
cur_offset so we remove the actual remaining part from ->bytes_may_use.
I validated this fix using my inject-error.py script
python inject-error.py -o should_fail_bio -t cache_save_setup -t \
__btrfs_prealloc_file_range \
-t insert_reserved_file_extent.constprop.0 \
-r "-5" ./run-fsstress.sh
where run-fsstress.sh simply mounts and runs fsstress on a disk.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
fs/btrfs/inode.c | 16 +++++++++++++---
1 file changed, 13 insertions(+), 3 deletions(-)
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 84e649724549..ffc6fcfe805c 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -9876,6 +9876,7 @@ static int __btrfs_prealloc_file_range(struct inode *inode, int mode,
struct btrfs_root *root = BTRFS_I(inode)->root;
struct btrfs_key ins;
u64 cur_offset = start;
+ u64 clear_offset = start;
u64 i_size;
u64 cur_bytes;
u64 last_alloc = (u64)-1;
@@ -9910,6 +9911,15 @@ static int __btrfs_prealloc_file_range(struct inode *inode, int mode,
btrfs_end_transaction(trans);
break;
}
+
+ /*
+ * We've reserved this space, and thus converted it from
+ * ->bytes_may_use to ->bytes_reserved. Any error that happens
+ * from here on out we will only need to clear our reservation
+ * for the remaining unreserved area, so advance our
+ * clear_offset by our extent size.
+ */
+ clear_offset += ins.offset;
btrfs_dec_block_group_reservations(fs_info, ins.objectid);
last_alloc = ins.offset;
@@ -9989,9 +9999,9 @@ static int __btrfs_prealloc_file_range(struct inode *inode, int mode,
if (own_trans)
btrfs_end_transaction(trans);
}
- if (cur_offset < end)
- btrfs_free_reserved_data_space(inode, NULL, cur_offset,
- end - cur_offset + 1);
+ if (clear_offset < end)
+ btrfs_free_reserved_data_space(inode, NULL, clear_offset,
+ end - clear_offset + 1);
return ret;
}
--
2.24.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
end of thread, other threads:[~2020-02-13 15:47 UTC | newest]
Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-11 21:40 [PATCH 0/4] Error condition failure fixes Josef Bacik
2020-02-11 21:40 ` [PATCH 1/4] btrfs: set fs_root = NULL on error Josef Bacik
2020-02-12 0:55 ` Qu Wenruo
2020-02-13 10:05 ` Johannes Thumshirn
2020-02-13 10:48 ` Nikolay Borisov
2020-02-13 15:31 ` Josef Bacik
2020-02-13 15:32 ` Nikolay Borisov
2020-02-11 21:40 ` [PATCH 2/4] btrfs: do not check delayed items are empty for single trans cleanup Josef Bacik
2020-02-12 0:57 ` Qu Wenruo
2020-02-13 10:28 ` Nikolay Borisov
2020-02-13 11:15 ` Johannes Thumshirn
2020-02-11 21:40 ` [PATCH 3/4] btrfs: handle logged extent failure properly Josef Bacik
2020-02-12 0:58 ` Qu Wenruo
2020-02-13 10:26 ` Nikolay Borisov
2020-02-13 11:16 ` Johannes Thumshirn
2020-02-11 21:40 ` [PATCH 4/4] btrfs: fix bytes_may_use underflow in prealloc error condtition Josef Bacik
2020-02-12 1:00 ` Qu Wenruo
2020-02-13 10:17 ` Nikolay Borisov
2020-02-13 15:29 ` Josef Bacik
2020-02-13 15:47 [PATCH 0/4][v2] Error condition failure fixes Josef Bacik
2020-02-13 15:47 ` [PATCH 4/4] btrfs: fix bytes_may_use underflow in prealloc error condtition Josef Bacik
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).