* [PATCH v2 1/2] btrfs: qgroup: Fix the wrong target io_tree when freeing reserved data space
@ 2019-09-16 12:02 Qu Wenruo
2019-09-16 12:02 ` [PATCH v2 2/2] btrfs: qgroup: Fix reserved data space leak if we have multiple reserve calls Qu Wenruo
` (2 more replies)
0 siblings, 3 replies; 6+ messages in thread
From: Qu Wenruo @ 2019-09-16 12:02 UTC (permalink / raw)
To: linux-btrfs; +Cc: Josef Bacik
[BUG]
Under the follow case with qgroup enabled, if some error happened after
we have reserved delalloc space, then in error handling path, we could
cause qgroup data space leakage:
From btrfs_truncate_block() in inode.c:
ret = btrfs_delalloc_reserve_space(inode, &data_reserved,
block_start, blocksize);
if (ret)
goto out;
again:
page = find_or_create_page(mapping, index, mask);
if (!page) {
btrfs_delalloc_release_space(inode, data_reserved,
block_start, blocksize, true);
btrfs_delalloc_release_extents(BTRFS_I(inode), blocksize, true);
ret = -ENOMEM;
goto out;
}
[CAUSE]
In above case, btrfs_delalloc_reserve_space() will call
btrfs_qgroup_reserve_data() and mark the io_tree range with
EXTENT_QGROUP_RESERVED flag.
In the error handling path, we have the following call stack:
btrfs_delalloc_release_space()
|- btrfs_free_reserved_data_space()
|- btrsf_qgroup_free_data()
|- __btrfs_qgroup_release_data(reserved=@reserved, free=1)
|- qgroup_free_reserved_data(reserved=@reserved)
|- clear_record_extent_bits();
|- freed += changeset.bytes_changed;
However due to a completion bug, qgroup_free_reserved_data() will clear
EXTENT_QGROUP_RESERVED flag in BTRFS_I(inode)->io_failure_tree, other
than the correct BTRFS_I(inode)->io_tree.
Since io_failure_tree is never marked with that flag,
btrfs_qgroup_free_data() will not free any data reserved space at all,
causing a leakage.
This type of error handling can only be triggered by errors outside of
qgroup code. So EDQUOT error from qgroup can't trigger it.
[FIX]
Fix the wrong target io_tree.
Reported-by: Josef Bacik <josef@toxicpanda.com>
Fixes: bc42bda22345 ("btrfs: qgroup: Fix qgroup reserved space underflow by only freeing reserved ranges")
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
Changelog:
v2:
- Commit message polishment
Use proper call chain to describe the error, as it's pretty deep.
And rephrase how to trigger the bug.
---
fs/btrfs/qgroup.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index 2891b57b9e1e..64bdc3e3652d 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -3492,7 +3492,7 @@ static int qgroup_free_reserved_data(struct inode *inode,
* EXTENT_QGROUP_RESERVED, we won't double free.
* So not need to rush.
*/
- ret = clear_record_extent_bits(&BTRFS_I(inode)->io_failure_tree,
+ ret = clear_record_extent_bits(&BTRFS_I(inode)->io_tree,
free_start, free_start + free_len - 1,
EXTENT_QGROUP_RESERVED, &changeset);
if (ret < 0)
--
2.23.0
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH v2 2/2] btrfs: qgroup: Fix reserved data space leak if we have multiple reserve calls
2019-09-16 12:02 [PATCH v2 1/2] btrfs: qgroup: Fix the wrong target io_tree when freeing reserved data space Qu Wenruo
@ 2019-09-16 12:02 ` Qu Wenruo
2019-09-24 9:12 ` Nikolay Borisov
2019-09-24 9:03 ` [PATCH v2 1/2] btrfs: qgroup: Fix the wrong target io_tree when freeing reserved data space Nikolay Borisov
2019-09-24 16:22 ` David Sterba
2 siblings, 1 reply; 6+ messages in thread
From: Qu Wenruo @ 2019-09-16 12:02 UTC (permalink / raw)
To: linux-btrfs
[BUG]
The following script can cause btrfs qgroup data space leak:
mkfs.btrfs -f $dev
mount $dev -o nospace_cache $mnt
btrfs subv create $mnt/subv
btrfs quota en $mnt
btrfs quota rescan -w $mnt
btrfs qgroup limit 128m $mnt/subv
for (( i = 0; i < 3; i++)); do
# Create 3 64M holes for latter fallocate to fail
truncate -s 192m $mnt/subv/file
xfs_io -c "pwrite 64m 4k" $mnt/subv/file > /dev/null
xfs_io -c "pwrite 128m 4k" $mnt/subv/file > /dev/null
sync
# it's supposed to fail, and each failure will leak at least 64M
# data space
xfs_io -f -c "falloc 0 192m" $mnt/subv/file &> /dev/null
rm $mnt/subv/file
sync
done
# Shouldn't fail after we removed the file
xfs_io -f -c "falloc 0 64m" $mnt/subv/file
[CAUSE]
Btrfs qgroup data reserve code allow multiple reservations to happen on
a single extent_changeset:
E.g:
btrfs_qgroup_reserve_data(inode, &data_reserved, 0, SZ_1M);
btrfs_qgroup_reserve_data(inode, &data_reserved, SZ_1M, SZ_2M);
btrfs_qgroup_reserve_data(inode, &data_reserved, 0, SZ_4M);
Btrfs qgroup code has its internal tracking to make sure we don't
double-reserve in above example.
The only pattern utlizing this feature is in the main while loop of
btrfs_fallocate() function.
However btrfs_qgroup_reserve_data()'s error handling has a bug in that
on error it clears all ranges in the io_tree with EXTENT_QGROUP_RESERVED
flag but doesn't free previously reserved bytes.
This bug has a two fold effect:
- Clearing EXTENT_QGROUP_RESERVED ranges
This is the correct behavior, but it prevents
btrfs_qgroup_check_reserved_leak() to catch the leakage as the
detector is purely EXTENT_QGROUP_RESERVED flag based.
- Leak the previously reserved data bytes.
The bug manifests when N calls to btrfs_qgroup_reserve_data are made and
the last one fails, leaking space reserved in the previous ones.
[FIX]
Also free previously reserved data bytes when btrfs_qgroup_reserve_data
fails.
Fixes: 524725537023 ("btrfs: qgroup: Introduce btrfs_qgroup_reserve_data function")
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
Changelog:
v2:
- Commit message polishment
Rephrase the multiple btrfs_qgroup_reserve_data() calls ability.
Rephrase the effect of the bug.
---
fs/btrfs/qgroup.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index 64bdc3e3652d..59f6a9981087 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -3448,6 +3448,9 @@ int btrfs_qgroup_reserve_data(struct inode *inode,
while ((unode = ulist_next(&reserved->range_changed, &uiter)))
clear_extent_bit(&BTRFS_I(inode)->io_tree, unode->val,
unode->aux, EXTENT_QGROUP_RESERVED, 0, 0, NULL);
+ /* Also free data bytes of already reserved one */
+ btrfs_qgroup_free_refroot(root->fs_info, root->root_key.objectid,
+ orig_reserved, BTRFS_QGROUP_RSV_DATA);
extent_changeset_release(reserved);
return ret;
}
--
2.23.0
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH v2 1/2] btrfs: qgroup: Fix the wrong target io_tree when freeing reserved data space
2019-09-16 12:02 [PATCH v2 1/2] btrfs: qgroup: Fix the wrong target io_tree when freeing reserved data space Qu Wenruo
2019-09-16 12:02 ` [PATCH v2 2/2] btrfs: qgroup: Fix reserved data space leak if we have multiple reserve calls Qu Wenruo
@ 2019-09-24 9:03 ` Nikolay Borisov
2019-09-24 16:22 ` David Sterba
2 siblings, 0 replies; 6+ messages in thread
From: Nikolay Borisov @ 2019-09-24 9:03 UTC (permalink / raw)
To: Qu Wenruo, linux-btrfs; +Cc: Josef Bacik
On 16.09.19 г. 15:02 ч., Qu Wenruo wrote:
> [BUG]
> Under the follow case with qgroup enabled, if some error happened after
> we have reserved delalloc space, then in error handling path, we could
> cause qgroup data space leakage:
>
> From btrfs_truncate_block() in inode.c:
>
> ret = btrfs_delalloc_reserve_space(inode, &data_reserved,
> block_start, blocksize);
> if (ret)
> goto out;
>
> again:
> page = find_or_create_page(mapping, index, mask);
> if (!page) {
> btrfs_delalloc_release_space(inode, data_reserved,
> block_start, blocksize, true);
> btrfs_delalloc_release_extents(BTRFS_I(inode), blocksize, true);
> ret = -ENOMEM;
> goto out;
> }
>
> [CAUSE]
> In above case, btrfs_delalloc_reserve_space() will call
> btrfs_qgroup_reserve_data() and mark the io_tree range with
> EXTENT_QGROUP_RESERVED flag.
>
> In the error handling path, we have the following call stack:
> btrfs_delalloc_release_space()
> |- btrfs_free_reserved_data_space()
> |- btrsf_qgroup_free_data()
> |- __btrfs_qgroup_release_data(reserved=@reserved, free=1)
> |- qgroup_free_reserved_data(reserved=@reserved)
> |- clear_record_extent_bits();
> |- freed += changeset.bytes_changed;
>
> However due to a completion bug, qgroup_free_reserved_data() will clear
> EXTENT_QGROUP_RESERVED flag in BTRFS_I(inode)->io_failure_tree, other
> than the correct BTRFS_I(inode)->io_tree.
> Since io_failure_tree is never marked with that flag,
> btrfs_qgroup_free_data() will not free any data reserved space at all,
> causing a leakage.
>
> This type of error handling can only be triggered by errors outside of
> qgroup code. So EDQUOT error from qgroup can't trigger it.
>
> [FIX]
> Fix the wrong target io_tree.
>
> Reported-by: Josef Bacik <josef@toxicpanda.com>
> Fixes: bc42bda22345 ("btrfs: qgroup: Fix qgroup reserved space underflow by only freeing reserved ranges")
> Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
> ---
> Changelog:
> v2:
> - Commit message polishment
> Use proper call chain to describe the error, as it's pretty deep.
> And rephrase how to trigger the bug.
> ---
> fs/btrfs/qgroup.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
> index 2891b57b9e1e..64bdc3e3652d 100644
> --- a/fs/btrfs/qgroup.c
> +++ b/fs/btrfs/qgroup.c
> @@ -3492,7 +3492,7 @@ static int qgroup_free_reserved_data(struct inode *inode,
> * EXTENT_QGROUP_RESERVED, we won't double free.
> * So not need to rush.
> */
> - ret = clear_record_extent_bits(&BTRFS_I(inode)->io_failure_tree,
> + ret = clear_record_extent_bits(&BTRFS_I(inode)->io_tree,
> free_start, free_start + free_len - 1,
> EXTENT_QGROUP_RESERVED, &changeset);
> if (ret < 0)
>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2 2/2] btrfs: qgroup: Fix reserved data space leak if we have multiple reserve calls
2019-09-16 12:02 ` [PATCH v2 2/2] btrfs: qgroup: Fix reserved data space leak if we have multiple reserve calls Qu Wenruo
@ 2019-09-24 9:12 ` Nikolay Borisov
2019-09-24 9:15 ` Qu Wenruo
0 siblings, 1 reply; 6+ messages in thread
From: Nikolay Borisov @ 2019-09-24 9:12 UTC (permalink / raw)
To: Qu Wenruo, linux-btrfs
On 16.09.19 г. 15:02 ч., Qu Wenruo wrote:
> [BUG]
> The following script can cause btrfs qgroup data space leak:
>
> mkfs.btrfs -f $dev
> mount $dev -o nospace_cache $mnt
>
> btrfs subv create $mnt/subv
> btrfs quota en $mnt
> btrfs quota rescan -w $mnt
> btrfs qgroup limit 128m $mnt/subv
>
> for (( i = 0; i < 3; i++)); do
> # Create 3 64M holes for latter fallocate to fail
> truncate -s 192m $mnt/subv/file
> xfs_io -c "pwrite 64m 4k" $mnt/subv/file > /dev/null
> xfs_io -c "pwrite 128m 4k" $mnt/subv/file > /dev/null
> sync
>
> # it's supposed to fail, and each failure will leak at least 64M
> # data space
> xfs_io -f -c "falloc 0 192m" $mnt/subv/file &> /dev/null
> rm $mnt/subv/file
> sync
> done
>
> # Shouldn't fail after we removed the file
> xfs_io -f -c "falloc 0 64m" $mnt/subv/file
Was this sent as a separate fstest case?
<snip>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2 2/2] btrfs: qgroup: Fix reserved data space leak if we have multiple reserve calls
2019-09-24 9:12 ` Nikolay Borisov
@ 2019-09-24 9:15 ` Qu Wenruo
0 siblings, 0 replies; 6+ messages in thread
From: Qu Wenruo @ 2019-09-24 9:15 UTC (permalink / raw)
To: Nikolay Borisov, Qu Wenruo, linux-btrfs
On 2019/9/24 下午5:12, Nikolay Borisov wrote:
>
>
> On 16.09.19 г. 15:02 ч., Qu Wenruo wrote:
>> [BUG]
>> The following script can cause btrfs qgroup data space leak:
>>
>> mkfs.btrfs -f $dev
>> mount $dev -o nospace_cache $mnt
>>
>> btrfs subv create $mnt/subv
>> btrfs quota en $mnt
>> btrfs quota rescan -w $mnt
>> btrfs qgroup limit 128m $mnt/subv
>>
>> for (( i = 0; i < 3; i++)); do
>> # Create 3 64M holes for latter fallocate to fail
>> truncate -s 192m $mnt/subv/file
>> xfs_io -c "pwrite 64m 4k" $mnt/subv/file > /dev/null
>> xfs_io -c "pwrite 128m 4k" $mnt/subv/file > /dev/null
>> sync
>>
>> # it's supposed to fail, and each failure will leak at least 64M
>> # data space
>> xfs_io -f -c "falloc 0 192m" $mnt/subv/file &> /dev/null
>> rm $mnt/subv/file
>> sync
>> done
>>
>> # Shouldn't fail after we removed the file
>> xfs_io -f -c "falloc 0 64m" $mnt/subv/file
>
> Was this sent as a separate fstest case?
Yep.
https://patchwork.kernel.org/patch/11145871/
Thanks,
Qu
>
> <snip>
>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2 1/2] btrfs: qgroup: Fix the wrong target io_tree when freeing reserved data space
2019-09-16 12:02 [PATCH v2 1/2] btrfs: qgroup: Fix the wrong target io_tree when freeing reserved data space Qu Wenruo
2019-09-16 12:02 ` [PATCH v2 2/2] btrfs: qgroup: Fix reserved data space leak if we have multiple reserve calls Qu Wenruo
2019-09-24 9:03 ` [PATCH v2 1/2] btrfs: qgroup: Fix the wrong target io_tree when freeing reserved data space Nikolay Borisov
@ 2019-09-24 16:22 ` David Sterba
2 siblings, 0 replies; 6+ messages in thread
From: David Sterba @ 2019-09-24 16:22 UTC (permalink / raw)
To: Qu Wenruo; +Cc: linux-btrfs, Josef Bacik
On Mon, Sep 16, 2019 at 08:02:38PM +0800, Qu Wenruo wrote:
[...]
1 and 2 added to misc-next, thanks.
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2019-09-24 16:22 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-16 12:02 [PATCH v2 1/2] btrfs: qgroup: Fix the wrong target io_tree when freeing reserved data space Qu Wenruo
2019-09-16 12:02 ` [PATCH v2 2/2] btrfs: qgroup: Fix reserved data space leak if we have multiple reserve calls Qu Wenruo
2019-09-24 9:12 ` Nikolay Borisov
2019-09-24 9:15 ` Qu Wenruo
2019-09-24 9:03 ` [PATCH v2 1/2] btrfs: qgroup: Fix the wrong target io_tree when freeing reserved data space Nikolay Borisov
2019-09-24 16:22 ` David Sterba
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).