All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] btrfs: Ensure proper sector alignment for btrfs_free_reserved_data_space
@ 2016-11-19  2:52 Jeff Mahoney
  2016-11-20 15:50 ` Liu Bo
  2016-11-21  0:57 ` Qu Wenruo
  0 siblings, 2 replies; 4+ messages in thread
From: Jeff Mahoney @ 2016-11-19  2:52 UTC (permalink / raw)
  To: linux-btrfs

From: Jeff Mahoney <jeffm@suse.com>
Subject: btrfs: Ensure proper sector alignment for
 btrfs_free_reserved_data_space
References: bsc#1005666
Patch-mainline: Submitted 18 Nov 2016, linux-btrfs

This fixes the WARN_ON on BTRFS_I(inode)->reserved_extents in
btrfs_destroy_inode and the WARN_ON on nonzero delalloc bytes on umount
with qgroups enabled.

I was able to reproduce this by setting up a small (~500kb) quota limit
and writing a file one byte at a time until I hit the limit.  The warnings
would all hit on umount.

The root cause is that we would reserve a block-sized range in both
the reservation and the quota in btrfs_check_data_free_space, but if we
encountered a problem (like e.g. EDQUOT), we would only release the single
byte in the qgroup reservation.  That caused an iotree state split, which
increased the number of outstanding extents, in turn disallowing releasing
the metadata reservation.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
---
 fs/btrfs/extent-tree.c |    7 +++++++
 1 file changed, 7 insertions(+)

--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3822,6 +3822,13 @@ void btrfs_free_reserved_data_space_noqu
  */
 void btrfs_free_reserved_data_space(struct inode *inode, u64 start, u64 len)
 {
+	struct btrfs_root *root = BTRFS_I(inode)->root;
+
+	/* Make sure the range is aligned to sectorsize */
+	len = round_up(start + len, root->sectorsize) -
+	      round_down(start, root->sectorsize);
+	start = round_down(start, root->sectorsize);
+
 	btrfs_free_reserved_data_space_noquota(inode, start, len);
 	btrfs_qgroup_free_data(inode, start, len);
 }


-- 
Jeff Mahoney
SUSE Labs

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] btrfs: Ensure proper sector alignment for btrfs_free_reserved_data_space
  2016-11-19  2:52 [PATCH] btrfs: Ensure proper sector alignment for btrfs_free_reserved_data_space Jeff Mahoney
@ 2016-11-20 15:50 ` Liu Bo
  2016-11-20 17:54   ` Jeff Mahoney
  2016-11-21  0:57 ` Qu Wenruo
  1 sibling, 1 reply; 4+ messages in thread
From: Liu Bo @ 2016-11-20 15:50 UTC (permalink / raw)
  To: Jeff Mahoney; +Cc: linux-btrfs

On Fri, Nov 18, 2016 at 09:52:40PM -0500, Jeff Mahoney wrote:
> From: Jeff Mahoney <jeffm@suse.com>
> Subject: btrfs: Ensure proper sector alignment for
>  btrfs_free_reserved_data_space
> References: bsc#1005666
> Patch-mainline: Submitted 18 Nov 2016, linux-btrfs
> 
> This fixes the WARN_ON on BTRFS_I(inode)->reserved_extents in
> btrfs_destroy_inode and the WARN_ON on nonzero delalloc bytes on umount
> with qgroups enabled.
> 
> I was able to reproduce this by setting up a small (~500kb) quota limit
> and writing a file one byte at a time until I hit the limit.  The warnings
> would all hit on umount.
> 
> The root cause is that we would reserve a block-sized range in both
> the reservation and the quota in btrfs_check_data_free_space, but if we
> encountered a problem (like e.g. EDQUOT), we would only release the single
> byte in the qgroup reservation.  That caused an iotree state split, which
> increased the number of outstanding extents, in turn disallowing releasing
> the metadata reservation.
> 
> Signed-off-by: Jeff Mahoney <jeffm@suse.com>
> ---
>  fs/btrfs/extent-tree.c |    7 +++++++
>  1 file changed, 7 insertions(+)
> 
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -3822,6 +3822,13 @@ void btrfs_free_reserved_data_space_noqu
>   */
>  void btrfs_free_reserved_data_space(struct inode *inode, u64 start, u64 len)
>  {
> +	struct btrfs_root *root = BTRFS_I(inode)->root;
> +
> +	/* Make sure the range is aligned to sectorsize */
> +	len = round_up(start + len, root->sectorsize) -
> +	      round_down(start, root->sectorsize);
> +	start = round_down(start, root->sectorsize);
> +
>  	btrfs_free_reserved_data_space_noquota(inode, start, len);
>  	btrfs_qgroup_free_data(inode, start, len);

The patch looks reasonable, but I'm afraid btrfs_fallocate can be
affected since in btrfs_fallocate(), btrfs_qgroup_reserve_data() takes
'cur_offset' and 'last_byte - cur_offset' which are possible unaligned
to root->sectorsize, but if any errors occur during allocation,
btrfs_qgroup_free_data() in btrfs_free_reserved_data_space() is gonna
free aligned range and it ends up a negative qgroup value.

Thanks,

-liubo

>  }
> 
> 
> -- 
> Jeff Mahoney
> SUSE Labs
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] btrfs: Ensure proper sector alignment for btrfs_free_reserved_data_space
  2016-11-20 15:50 ` Liu Bo
@ 2016-11-20 17:54   ` Jeff Mahoney
  0 siblings, 0 replies; 4+ messages in thread
From: Jeff Mahoney @ 2016-11-20 17:54 UTC (permalink / raw)
  To: bo.li.liu; +Cc: linux-btrfs


[-- Attachment #1.1: Type: text/plain, Size: 3114 bytes --]

On 11/20/16 10:50 AM, Liu Bo wrote:
> On Fri, Nov 18, 2016 at 09:52:40PM -0500, Jeff Mahoney wrote:
>> From: Jeff Mahoney <jeffm@suse.com>
>> Subject: btrfs: Ensure proper sector alignment for
>>  btrfs_free_reserved_data_space
>> References: bsc#1005666
>> Patch-mainline: Submitted 18 Nov 2016, linux-btrfs
>>
>> This fixes the WARN_ON on BTRFS_I(inode)->reserved_extents in
>> btrfs_destroy_inode and the WARN_ON on nonzero delalloc bytes on umount
>> with qgroups enabled.
>>
>> I was able to reproduce this by setting up a small (~500kb) quota limit
>> and writing a file one byte at a time until I hit the limit.  The warnings
>> would all hit on umount.
>>
>> The root cause is that we would reserve a block-sized range in both
>> the reservation and the quota in btrfs_check_data_free_space, but if we
>> encountered a problem (like e.g. EDQUOT), we would only release the single
>> byte in the qgroup reservation.  That caused an iotree state split, which
>> increased the number of outstanding extents, in turn disallowing releasing
>> the metadata reservation.
>>
>> Signed-off-by: Jeff Mahoney <jeffm@suse.com>
>> ---
>>  fs/btrfs/extent-tree.c |    7 +++++++
>>  1 file changed, 7 insertions(+)
>>
>> --- a/fs/btrfs/extent-tree.c
>> +++ b/fs/btrfs/extent-tree.c
>> @@ -3822,6 +3822,13 @@ void btrfs_free_reserved_data_space_noqu
>>   */
>>  void btrfs_free_reserved_data_space(struct inode *inode, u64 start, u64 len)
>>  {
>> +	struct btrfs_root *root = BTRFS_I(inode)->root;
>> +
>> +	/* Make sure the range is aligned to sectorsize */
>> +	len = round_up(start + len, root->sectorsize) -
>> +	      round_down(start, root->sectorsize);
>> +	start = round_down(start, root->sectorsize);
>> +
>>  	btrfs_free_reserved_data_space_noquota(inode, start, len);
>>  	btrfs_qgroup_free_data(inode, start, len);
> 
> The patch looks reasonable, but I'm afraid btrfs_fallocate can be
> affected since in btrfs_fallocate(), btrfs_qgroup_reserve_data() takes
> 'cur_offset' and 'last_byte - cur_offset' which are possible unaligned
> to root->sectorsize, but if any errors occur during allocation,
> btrfs_qgroup_free_data() in btrfs_free_reserved_data_space() is gonna
> free aligned range and it ends up a negative qgroup value.

Ok, yeah.  I was thinking about this later that evening but hadn't
gotten a chance to dig back into it.  I think the biggest thing is that
the handling of space reservation and qgroups is way too complicated.
It seems nearly impossible for new bugs *not* to sneak in whenever we
touch them.

For this particular bit, though, both cur_offset and last_byte are
sector aligned in btrfs_fallocate, at least in the current mainline
HEAD.  I think fixing up the alignment in the reservation and qgroups
routines is probably the wrong way to do it.  Instead, we should expect
the callers to handle the alignment properly and complain very loudly if
they fail to do that.  I started in on patches to do that after I
submitted the one above, but wanted to get the fix sent first.

-Jeff

-- 
Jeff Mahoney
SUSE Labs


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 827 bytes --]

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] btrfs: Ensure proper sector alignment for btrfs_free_reserved_data_space
  2016-11-19  2:52 [PATCH] btrfs: Ensure proper sector alignment for btrfs_free_reserved_data_space Jeff Mahoney
  2016-11-20 15:50 ` Liu Bo
@ 2016-11-21  0:57 ` Qu Wenruo
  1 sibling, 0 replies; 4+ messages in thread
From: Qu Wenruo @ 2016-11-21  0:57 UTC (permalink / raw)
  To: Jeff Mahoney, linux-btrfs



At 11/19/2016 10:52 AM, Jeff Mahoney wrote:
> From: Jeff Mahoney <jeffm@suse.com>
> Subject: btrfs: Ensure proper sector alignment for
>  btrfs_free_reserved_data_space
> References: bsc#1005666
> Patch-mainline: Submitted 18 Nov 2016, linux-btrfs
>
> This fixes the WARN_ON on BTRFS_I(inode)->reserved_extents in
> btrfs_destroy_inode and the WARN_ON on nonzero delalloc bytes on umount
> with qgroups enabled.
>
> I was able to reproduce this by setting up a small (~500kb) quota limit
> and writing a file one byte at a time until I hit the limit.  The warnings
> would all hit on umount.
>
> The root cause is that we would reserve a block-sized range in both
> the reservation and the quota in btrfs_check_data_free_space, but if we
> encountered a problem (like e.g. EDQUOT), we would only release the single
> byte in the qgroup reservation.

Good catch.

 > That caused an iotree state split, which
> increased the number of outstanding extents, in turn disallowing releasing
> the metadata reservation.
>
> Signed-off-by: Jeff Mahoney <jeffm@suse.com>

Reviewed-by: Qu Wenruo <quwenruo@cn.fujitsu.com>

What about adding more assert/warn_on to reservation/qgroup freeing and 
allocation functions?

It would help much more to detect similar problems.

Thanks,
Qu

> ---
>  fs/btrfs/extent-tree.c |    7 +++++++
>  1 file changed, 7 insertions(+)
>
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -3822,6 +3822,13 @@ void btrfs_free_reserved_data_space_noqu
>   */
>  void btrfs_free_reserved_data_space(struct inode *inode, u64 start, u64 len)
>  {
> +	struct btrfs_root *root = BTRFS_I(inode)->root;
> +
> +	/* Make sure the range is aligned to sectorsize */
> +	len = round_up(start + len, root->sectorsize) -
> +	      round_down(start, root->sectorsize);
> +	start = round_down(start, root->sectorsize);
> +
>  	btrfs_free_reserved_data_space_noquota(inode, start, len);
>  	btrfs_qgroup_free_data(inode, start, len);
>  }
>
>



^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2016-11-21  0:57 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-11-19  2:52 [PATCH] btrfs: Ensure proper sector alignment for btrfs_free_reserved_data_space Jeff Mahoney
2016-11-20 15:50 ` Liu Bo
2016-11-20 17:54   ` Jeff Mahoney
2016-11-21  0:57 ` Qu Wenruo

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.