All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 1/2] btrfs: qgroup: Fix the wrong target io_tree when freeing reserved data space
@ 2019-09-16 12:02 Qu Wenruo
  2019-09-16 12:02 ` [PATCH v2 2/2] btrfs: qgroup: Fix reserved data space leak if we have multiple reserve calls Qu Wenruo
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Qu Wenruo @ 2019-09-16 12:02 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Josef Bacik

[BUG]
Under the follow case with qgroup enabled, if some error happened after
we have reserved delalloc space, then in error handling path, we could
cause qgroup data space leakage:

From btrfs_truncate_block() in inode.c:

	ret = btrfs_delalloc_reserve_space(inode, &data_reserved,
					   block_start, blocksize);
	if (ret)
		goto out;

again:
	page = find_or_create_page(mapping, index, mask);
	if (!page) {
		btrfs_delalloc_release_space(inode, data_reserved,
					     block_start, blocksize, true);
		btrfs_delalloc_release_extents(BTRFS_I(inode), blocksize, true);
		ret = -ENOMEM;
		goto out;
	}

[CAUSE]
In above case, btrfs_delalloc_reserve_space() will call
btrfs_qgroup_reserve_data() and mark the io_tree range with
EXTENT_QGROUP_RESERVED flag.

In the error handling path, we have the following call stack:
btrfs_delalloc_release_space()
|- btrfs_free_reserved_data_space()
   |- btrsf_qgroup_free_data()
      |- __btrfs_qgroup_release_data(reserved=@reserved, free=1)
         |- qgroup_free_reserved_data(reserved=@reserved)
            |- clear_record_extent_bits();
            |- freed += changeset.bytes_changed;

However due to a completion bug, qgroup_free_reserved_data() will clear
EXTENT_QGROUP_RESERVED flag in BTRFS_I(inode)->io_failure_tree, other
than the correct BTRFS_I(inode)->io_tree.
Since io_failure_tree is never marked with that flag,
btrfs_qgroup_free_data() will not free any data reserved space at all,
causing a leakage.

This type of error handling can only be triggered by errors outside of
qgroup code. So EDQUOT error from qgroup can't trigger it.

[FIX]
Fix the wrong target io_tree.

Reported-by: Josef Bacik <josef@toxicpanda.com>
Fixes: bc42bda22345 ("btrfs: qgroup: Fix qgroup reserved space underflow by only freeing reserved ranges")
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
Changelog:
v2:
- Commit message polishment
  Use proper call chain to describe the error, as it's pretty deep.
  And rephrase how to trigger the bug.
---
 fs/btrfs/qgroup.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index 2891b57b9e1e..64bdc3e3652d 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -3492,7 +3492,7 @@ static int qgroup_free_reserved_data(struct inode *inode,
 		 * EXTENT_QGROUP_RESERVED, we won't double free.
 		 * So not need to rush.
 		 */
-		ret = clear_record_extent_bits(&BTRFS_I(inode)->io_failure_tree,
+		ret = clear_record_extent_bits(&BTRFS_I(inode)->io_tree,
 				free_start, free_start + free_len - 1,
 				EXTENT_QGROUP_RESERVED, &changeset);
 		if (ret < 0)
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH v2 2/2] btrfs: qgroup: Fix reserved data space leak if we have multiple reserve calls
  2019-09-16 12:02 [PATCH v2 1/2] btrfs: qgroup: Fix the wrong target io_tree when freeing reserved data space Qu Wenruo
@ 2019-09-16 12:02 ` Qu Wenruo
  2019-09-24  9:12   ` Nikolay Borisov
  2019-09-24  9:03 ` [PATCH v2 1/2] btrfs: qgroup: Fix the wrong target io_tree when freeing reserved data space Nikolay Borisov
  2019-09-24 16:22 ` David Sterba
  2 siblings, 1 reply; 6+ messages in thread
From: Qu Wenruo @ 2019-09-16 12:02 UTC (permalink / raw)
  To: linux-btrfs

[BUG]
The following script can cause btrfs qgroup data space leak:

  mkfs.btrfs -f $dev
  mount $dev -o nospace_cache $mnt

  btrfs subv create $mnt/subv
  btrfs quota en $mnt
  btrfs quota rescan -w $mnt
  btrfs qgroup limit 128m $mnt/subv

  for (( i = 0; i < 3; i++)); do
          # Create 3 64M holes for latter fallocate to fail
          truncate -s 192m $mnt/subv/file
          xfs_io -c "pwrite 64m 4k" $mnt/subv/file > /dev/null
          xfs_io -c "pwrite 128m 4k" $mnt/subv/file > /dev/null
          sync

          # it's supposed to fail, and each failure will leak at least 64M
          # data space
          xfs_io -f -c "falloc 0 192m" $mnt/subv/file &> /dev/null
          rm $mnt/subv/file
          sync
  done

  # Shouldn't fail after we removed the file
  xfs_io -f -c "falloc 0 64m" $mnt/subv/file

[CAUSE]
Btrfs qgroup data reserve code allow multiple reservations to happen on
a single extent_changeset:
E.g:
	btrfs_qgroup_reserve_data(inode, &data_reserved, 0, SZ_1M);
	btrfs_qgroup_reserve_data(inode, &data_reserved, SZ_1M, SZ_2M);
	btrfs_qgroup_reserve_data(inode, &data_reserved, 0, SZ_4M);

Btrfs qgroup code has its internal tracking to make sure we don't
double-reserve in above example.

The only pattern utlizing this feature is in the main while loop of
btrfs_fallocate() function.

However btrfs_qgroup_reserve_data()'s error handling has a bug in that
on error it clears all ranges in the io_tree with EXTENT_QGROUP_RESERVED
flag but doesn't free previously reserved bytes.

This bug has a two fold effect:
- Clearing EXTENT_QGROUP_RESERVED ranges
  This is the correct behavior, but it prevents
  btrfs_qgroup_check_reserved_leak() to catch the leakage as the
  detector is purely EXTENT_QGROUP_RESERVED flag based.

- Leak the previously reserved data bytes.

The bug manifests when N calls to btrfs_qgroup_reserve_data are made and
the last one fails, leaking space reserved in the previous ones.

[FIX]
Also free previously reserved data bytes when btrfs_qgroup_reserve_data
fails.

Fixes: 524725537023 ("btrfs: qgroup: Introduce btrfs_qgroup_reserve_data function")
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
Changelog:
v2:
- Commit message polishment
  Rephrase the multiple btrfs_qgroup_reserve_data() calls ability.
  Rephrase the effect of the bug.
---
 fs/btrfs/qgroup.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index 64bdc3e3652d..59f6a9981087 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -3448,6 +3448,9 @@ int btrfs_qgroup_reserve_data(struct inode *inode,
 	while ((unode = ulist_next(&reserved->range_changed, &uiter)))
 		clear_extent_bit(&BTRFS_I(inode)->io_tree, unode->val,
 				 unode->aux, EXTENT_QGROUP_RESERVED, 0, 0, NULL);
+	/* Also free data bytes of already reserved one */
+	btrfs_qgroup_free_refroot(root->fs_info, root->root_key.objectid,
+				  orig_reserved, BTRFS_QGROUP_RSV_DATA);
 	extent_changeset_release(reserved);
 	return ret;
 }
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v2 1/2] btrfs: qgroup: Fix the wrong target io_tree when freeing reserved data space
  2019-09-16 12:02 [PATCH v2 1/2] btrfs: qgroup: Fix the wrong target io_tree when freeing reserved data space Qu Wenruo
  2019-09-16 12:02 ` [PATCH v2 2/2] btrfs: qgroup: Fix reserved data space leak if we have multiple reserve calls Qu Wenruo
@ 2019-09-24  9:03 ` Nikolay Borisov
  2019-09-24 16:22 ` David Sterba
  2 siblings, 0 replies; 6+ messages in thread
From: Nikolay Borisov @ 2019-09-24  9:03 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs; +Cc: Josef Bacik



On 16.09.19 г. 15:02 ч., Qu Wenruo wrote:
> [BUG]
> Under the follow case with qgroup enabled, if some error happened after
> we have reserved delalloc space, then in error handling path, we could
> cause qgroup data space leakage:
> 
> From btrfs_truncate_block() in inode.c:
> 
> 	ret = btrfs_delalloc_reserve_space(inode, &data_reserved,
> 					   block_start, blocksize);
> 	if (ret)
> 		goto out;
> 
> again:
> 	page = find_or_create_page(mapping, index, mask);
> 	if (!page) {
> 		btrfs_delalloc_release_space(inode, data_reserved,
> 					     block_start, blocksize, true);
> 		btrfs_delalloc_release_extents(BTRFS_I(inode), blocksize, true);
> 		ret = -ENOMEM;
> 		goto out;
> 	}
> 
> [CAUSE]
> In above case, btrfs_delalloc_reserve_space() will call
> btrfs_qgroup_reserve_data() and mark the io_tree range with
> EXTENT_QGROUP_RESERVED flag.
> 
> In the error handling path, we have the following call stack:
> btrfs_delalloc_release_space()
> |- btrfs_free_reserved_data_space()
>    |- btrsf_qgroup_free_data()
>       |- __btrfs_qgroup_release_data(reserved=@reserved, free=1)
>          |- qgroup_free_reserved_data(reserved=@reserved)
>             |- clear_record_extent_bits();
>             |- freed += changeset.bytes_changed;
> 
> However due to a completion bug, qgroup_free_reserved_data() will clear
> EXTENT_QGROUP_RESERVED flag in BTRFS_I(inode)->io_failure_tree, other
> than the correct BTRFS_I(inode)->io_tree.
> Since io_failure_tree is never marked with that flag,
> btrfs_qgroup_free_data() will not free any data reserved space at all,
> causing a leakage.
> 
> This type of error handling can only be triggered by errors outside of
> qgroup code. So EDQUOT error from qgroup can't trigger it.
> 
> [FIX]
> Fix the wrong target io_tree.
> 
> Reported-by: Josef Bacik <josef@toxicpanda.com>
> Fixes: bc42bda22345 ("btrfs: qgroup: Fix qgroup reserved space underflow by only freeing reserved ranges")
> Signed-off-by: Qu Wenruo <wqu@suse.com>

Reviewed-by: Nikolay Borisov <nborisov@suse.com>

> ---
> Changelog:
> v2:
> - Commit message polishment
>   Use proper call chain to describe the error, as it's pretty deep.
>   And rephrase how to trigger the bug.
> ---
>  fs/btrfs/qgroup.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
> index 2891b57b9e1e..64bdc3e3652d 100644
> --- a/fs/btrfs/qgroup.c
> +++ b/fs/btrfs/qgroup.c
> @@ -3492,7 +3492,7 @@ static int qgroup_free_reserved_data(struct inode *inode,
>  		 * EXTENT_QGROUP_RESERVED, we won't double free.
>  		 * So not need to rush.
>  		 */
> -		ret = clear_record_extent_bits(&BTRFS_I(inode)->io_failure_tree,
> +		ret = clear_record_extent_bits(&BTRFS_I(inode)->io_tree,
>  				free_start, free_start + free_len - 1,
>  				EXTENT_QGROUP_RESERVED, &changeset);
>  		if (ret < 0)
> 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2 2/2] btrfs: qgroup: Fix reserved data space leak if we have multiple reserve calls
  2019-09-16 12:02 ` [PATCH v2 2/2] btrfs: qgroup: Fix reserved data space leak if we have multiple reserve calls Qu Wenruo
@ 2019-09-24  9:12   ` Nikolay Borisov
  2019-09-24  9:15     ` Qu Wenruo
  0 siblings, 1 reply; 6+ messages in thread
From: Nikolay Borisov @ 2019-09-24  9:12 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs



On 16.09.19 г. 15:02 ч., Qu Wenruo wrote:
> [BUG]
> The following script can cause btrfs qgroup data space leak:
> 
>   mkfs.btrfs -f $dev
>   mount $dev -o nospace_cache $mnt
> 
>   btrfs subv create $mnt/subv
>   btrfs quota en $mnt
>   btrfs quota rescan -w $mnt
>   btrfs qgroup limit 128m $mnt/subv
> 
>   for (( i = 0; i < 3; i++)); do
>           # Create 3 64M holes for latter fallocate to fail
>           truncate -s 192m $mnt/subv/file
>           xfs_io -c "pwrite 64m 4k" $mnt/subv/file > /dev/null
>           xfs_io -c "pwrite 128m 4k" $mnt/subv/file > /dev/null
>           sync
> 
>           # it's supposed to fail, and each failure will leak at least 64M
>           # data space
>           xfs_io -f -c "falloc 0 192m" $mnt/subv/file &> /dev/null
>           rm $mnt/subv/file
>           sync
>   done
> 
>   # Shouldn't fail after we removed the file
>   xfs_io -f -c "falloc 0 64m" $mnt/subv/file

Was this sent as a separate fstest case?

<snip>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2 2/2] btrfs: qgroup: Fix reserved data space leak if we have multiple reserve calls
  2019-09-24  9:12   ` Nikolay Borisov
@ 2019-09-24  9:15     ` Qu Wenruo
  0 siblings, 0 replies; 6+ messages in thread
From: Qu Wenruo @ 2019-09-24  9:15 UTC (permalink / raw)
  To: Nikolay Borisov, Qu Wenruo, linux-btrfs



On 2019/9/24 下午5:12, Nikolay Borisov wrote:
>
>
> On 16.09.19 г. 15:02 ч., Qu Wenruo wrote:
>> [BUG]
>> The following script can cause btrfs qgroup data space leak:
>>
>>   mkfs.btrfs -f $dev
>>   mount $dev -o nospace_cache $mnt
>>
>>   btrfs subv create $mnt/subv
>>   btrfs quota en $mnt
>>   btrfs quota rescan -w $mnt
>>   btrfs qgroup limit 128m $mnt/subv
>>
>>   for (( i = 0; i < 3; i++)); do
>>           # Create 3 64M holes for latter fallocate to fail
>>           truncate -s 192m $mnt/subv/file
>>           xfs_io -c "pwrite 64m 4k" $mnt/subv/file > /dev/null
>>           xfs_io -c "pwrite 128m 4k" $mnt/subv/file > /dev/null
>>           sync
>>
>>           # it's supposed to fail, and each failure will leak at least 64M
>>           # data space
>>           xfs_io -f -c "falloc 0 192m" $mnt/subv/file &> /dev/null
>>           rm $mnt/subv/file
>>           sync
>>   done
>>
>>   # Shouldn't fail after we removed the file
>>   xfs_io -f -c "falloc 0 64m" $mnt/subv/file
>
> Was this sent as a separate fstest case?

Yep.

https://patchwork.kernel.org/patch/11145871/

Thanks,
Qu

>
> <snip>
>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2 1/2] btrfs: qgroup: Fix the wrong target io_tree when freeing reserved data space
  2019-09-16 12:02 [PATCH v2 1/2] btrfs: qgroup: Fix the wrong target io_tree when freeing reserved data space Qu Wenruo
  2019-09-16 12:02 ` [PATCH v2 2/2] btrfs: qgroup: Fix reserved data space leak if we have multiple reserve calls Qu Wenruo
  2019-09-24  9:03 ` [PATCH v2 1/2] btrfs: qgroup: Fix the wrong target io_tree when freeing reserved data space Nikolay Borisov
@ 2019-09-24 16:22 ` David Sterba
  2 siblings, 0 replies; 6+ messages in thread
From: David Sterba @ 2019-09-24 16:22 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs, Josef Bacik

On Mon, Sep 16, 2019 at 08:02:38PM +0800, Qu Wenruo wrote:
[...]

1 and 2 added to misc-next, thanks.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2019-09-24 16:22 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-16 12:02 [PATCH v2 1/2] btrfs: qgroup: Fix the wrong target io_tree when freeing reserved data space Qu Wenruo
2019-09-16 12:02 ` [PATCH v2 2/2] btrfs: qgroup: Fix reserved data space leak if we have multiple reserve calls Qu Wenruo
2019-09-24  9:12   ` Nikolay Borisov
2019-09-24  9:15     ` Qu Wenruo
2019-09-24  9:03 ` [PATCH v2 1/2] btrfs: qgroup: Fix the wrong target io_tree when freeing reserved data space Nikolay Borisov
2019-09-24 16:22 ` David Sterba

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.