linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] btrfs: fix a potential hole-punching failure
@ 2021-03-24  9:31 bingjingc
  2021-03-24 11:30 ` Filipe Manana
  0 siblings, 1 reply; 3+ messages in thread
From: bingjingc @ 2021-03-24  9:31 UTC (permalink / raw)
  To: josef, dsterba, quwenruo, clm, linux-btrfs, linux-kernel
  Cc: bingjingc, cccheng, robbieko

From: BingJing Chang <bingjingc@synology.com>

In commit d77815461f04 ("btrfs: Avoid trucating page or punching hole in
a already existed hole."), existed holes can be skipped by calling
find_first_non_hole() to adjust *start and *len. However, if the given
len is invalid and large, when an EXTENT_MAP_HOLE extent is found, the
*len will not be set to zero because (em->start + em->len) is less than
(*start + *len). Then the ret will be 1 but the *len will not be set to
0. The propagated non-zero ret will result in fallocate failure.

In the while-loop of btrfs_replace_file_extents(), len is not updated
every time before it calls find_first_non_hole(). That is, if the last
file extent in the given hole-punching range has been dropped but
btrfs_drop_extents() fails with -ENOSPC (btrfs_drop_extents() runs out
of reserved space of the given transaction), the problem can happen.
After it calls find_first_non_hole(), the cur_offset will be adjusted
to be larger than or equal to end. However, since the len is not set to
zero. The break-loop condition (ret && !len) will not meet. After it
leaves the while-loop, uncleared ret will result in fallocate failure.

We're not able to construct a reproducible way to let
btrfs_drop_extents() fails with -ENOSPC after it drops the last file
extent but with remaining holes. However, it's quite easy to fix. We
just need to update and check the len every time before we call
find_first_non_hole(). To make the while loop more readable, we also
pull the variable updates to the bottom of loop like this:
while (cur_offset < end) {
        ...
        // update cur_offset & len
        // advance cur_offset & len in hole-punching case if needed
}

Reported-by: Robbie Ko <robbieko@synology.com>
Fixes: d77815461f04 ("btrfs: Avoid trucating page or punching hole in a
already existed hole.")
Reviewed-by: Robbie Ko <robbieko@synology.com>
Reviewed-by: Chung-Chiang Cheng <cccheng@synology.com>
Signed-off-by: BingJing Chang <bingjingc@synology.com>
---
 fs/btrfs/file.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 0e155f0..dccb017 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -2735,8 +2735,6 @@ int btrfs_replace_file_extents(struct inode *inode, struct btrfs_path *path,
 			extent_info->file_offset += replace_len;
 		}
 
-		cur_offset = drop_args.drop_end;
-
 		ret = btrfs_update_inode(trans, root, BTRFS_I(inode));
 		if (ret)
 			break;
@@ -2756,7 +2754,9 @@ int btrfs_replace_file_extents(struct inode *inode, struct btrfs_path *path,
 		BUG_ON(ret);	/* shouldn't happen */
 		trans->block_rsv = rsv;
 
-		if (!extent_info) {
+		cur_offset = drop_args.drop_end;
+		len = end - cur_offset;
+		if (!extent_info && len) {
 			ret = find_first_non_hole(BTRFS_I(inode), &cur_offset,
 						  &len);
 			if (unlikely(ret < 0))
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] btrfs: fix a potential hole-punching failure
  2021-03-24  9:31 [PATCH] btrfs: fix a potential hole-punching failure bingjingc
@ 2021-03-24 11:30 ` Filipe Manana
  2021-03-25  2:06   ` bingjing chang
  0 siblings, 1 reply; 3+ messages in thread
From: Filipe Manana @ 2021-03-24 11:30 UTC (permalink / raw)
  To: bingjingc
  Cc: Josef Bacik, David Sterba, Qu Wenruo, Chris Mason, linux-btrfs,
	Linux Kernel Mailing List, cccheng, Robbie Ko

On Wed, Mar 24, 2021 at 11:15 AM bingjingc <bingjingc@synology.com> wrote:
>
> From: BingJing Chang <bingjingc@synology.com>
>
> In commit d77815461f04 ("btrfs: Avoid trucating page or punching hole in
> a already existed hole."), existed holes can be skipped by calling
> find_first_non_hole() to adjust *start and *len. However, if the given
> len is invalid and large, when an EXTENT_MAP_HOLE extent is found, the
> *len will not be set to zero because (em->start + em->len) is less than
> (*start + *len). Then the ret will be 1 but the *len will not be set to
> 0. The propagated non-zero ret will result in fallocate failure.
>
> In the while-loop of btrfs_replace_file_extents(), len is not updated
> every time before it calls find_first_non_hole(). That is, if the last
> file extent in the given hole-punching range has been dropped but
> btrfs_drop_extents() fails with -ENOSPC (btrfs_drop_extents() runs out
> of reserved space of the given transaction), the problem can happen.

This is not entirely clear. Dropping the last extent and still
returning ENOSPC is confusing.
I think you mean that it drops the last file extent item that does not
represent hole (disk_bytenr > 0), and after it there's only one file
extent item representing a hole (disk_bytenr == 0).
It fails with -ENOSPC when attempting to drop the file extent item
representing the hole, after successfully dropping the non-hole file
extent item.
Is that it?

> After it calls find_first_non_hole(), the cur_offset will be adjusted
> to be larger than or equal to end. However, since the len is not set to
> zero. The break-loop condition (ret && !len) will not meet. After it
> leaves the while-loop, uncleared ret will result in fallocate failure.

Ok, fallocate will return 1, an unexpected return value.

>
> We're not able to construct a reproducible way to let
> btrfs_drop_extents() fails with -ENOSPC after it drops the last file
> extent but with remaining holes. However, it's quite easy to fix. We
> just need to update and check the len every time before we call
> find_first_non_hole(). To make the while loop more readable, we also
> pull the variable updates to the bottom of loop like this:
> while (cur_offset < end) {
>         ...
>         // update cur_offset & len
>         // advance cur_offset & len in hole-punching case if needed
> }
>
> Reported-by: Robbie Ko <robbieko@synology.com>
> Fixes: d77815461f04 ("btrfs: Avoid trucating page or punching hole in a
> already existed hole.")
> Reviewed-by: Robbie Ko <robbieko@synology.com>
> Reviewed-by: Chung-Chiang Cheng <cccheng@synology.com>
> Signed-off-by: BingJing Chang <bingjingc@synology.com>

Looks good.
Please just update that paragraph to be more clear about what is going on.

Thanks.

> ---
>  fs/btrfs/file.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> index 0e155f0..dccb017 100644
> --- a/fs/btrfs/file.c
> +++ b/fs/btrfs/file.c
> @@ -2735,8 +2735,6 @@ int btrfs_replace_file_extents(struct inode *inode, struct btrfs_path *path,
>                         extent_info->file_offset += replace_len;
>                 }
>
> -               cur_offset = drop_args.drop_end;
> -
>                 ret = btrfs_update_inode(trans, root, BTRFS_I(inode));
>                 if (ret)
>                         break;
> @@ -2756,7 +2754,9 @@ int btrfs_replace_file_extents(struct inode *inode, struct btrfs_path *path,
>                 BUG_ON(ret);    /* shouldn't happen */
>                 trans->block_rsv = rsv;
>
> -               if (!extent_info) {
> +               cur_offset = drop_args.drop_end;
> +               len = end - cur_offset;
> +               if (!extent_info && len) {
>                         ret = find_first_non_hole(BTRFS_I(inode), &cur_offset,
>                                                   &len);
>                         if (unlikely(ret < 0))
> --
> 2.7.4
>


-- 
Filipe David Manana,

“Whether you think you can, or you think you can't — you're right.”

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] btrfs: fix a potential hole-punching failure
  2021-03-24 11:30 ` Filipe Manana
@ 2021-03-25  2:06   ` bingjing chang
  0 siblings, 0 replies; 3+ messages in thread
From: bingjing chang @ 2021-03-25  2:06 UTC (permalink / raw)
  To: fdmanana
  Cc: bingjingc, Josef Bacik, David Sterba, Qu Wenruo, Chris Mason,
	linux-btrfs, Linux Kernel Mailing List, cccheng, Robbie Ko

In order to reply in plain text, I send the mail from Gmail.

Filipe Manana <fdmanana@gmail.com> 於 2021年3月24日 週三 下午8:16寫道:
>
> On Wed, Mar 24, 2021 at 11:15 AM bingjingc <bingjingc@synology.com> wrote:
> >
> > From: BingJing Chang <bingjingc@synology.com>
> >
> > In commit d77815461f04 ("btrfs: Avoid trucating page or punching hole in
> > a already existed hole."), existed holes can be skipped by calling
> > find_first_non_hole() to adjust *start and *len. However, if the given
> > len is invalid and large, when an EXTENT_MAP_HOLE extent is found, the
> > *len will not be set to zero because (em->start + em->len) is less than
> > (*start + *len). Then the ret will be 1 but the *len will not be set to
> > 0. The propagated non-zero ret will result in fallocate failure.
> >
> > In the while-loop of btrfs_replace_file_extents(), len is not updated
> > every time before it calls find_first_non_hole(). That is, if the last
> > file extent in the given hole-punching range has been dropped but
> > btrfs_drop_extents() fails with -ENOSPC (btrfs_drop_extents() runs out
> > of reserved space of the given transaction), the problem can happen.
>
> This is not entirely clear. Dropping the last extent and still
> returning ENOSPC is confusing.
> I think you mean that it drops the last file extent item that does not
> represent hole (disk_bytenr > 0), and after it there's only one file
> extent item representing a hole (disk_bytenr == 0).
> It fails with -ENOSPC when attempting to drop the file extent item
> representing the hole, after successfully dropping the non-hole file
> extent item.
> Is that it?
>

Thank you for your comments. You're right.
Saying the last file extent is not correct and confusing.
I revised and send the v2 patch for fixing the commit message. Thank you.

> > After it calls find_first_non_hole(), the cur_offset will be adjusted
> > to be larger than or equal to end. However, since the len is not set to
> > zero. The break-loop condition (ret && !len) will not meet. After it
> > leaves the while-loop, uncleared ret will result in fallocate failure.
>
> Ok, fallocate will return 1, an unexpected return value.
>
> >
> > We're not able to construct a reproducible way to let
> > btrfs_drop_extents() fails with -ENOSPC after it drops the last file
> > extent but with remaining holes. However, it's quite easy to fix. We
> > just need to update and check the len every time before we call
> > find_first_non_hole(). To make the while loop more readable, we also
> > pull the variable updates to the bottom of loop like this:
> > while (cur_offset < end) {
> >         ...
> >         // update cur_offset & len
> >         // advance cur_offset & len in hole-punching case if needed
> > }
> >
> > Reported-by: Robbie Ko <robbieko@synology.com>
> > Fixes: d77815461f04 ("btrfs: Avoid trucating page or punching hole in a
> > already existed hole.")
> > Reviewed-by: Robbie Ko <robbieko@synology.com>
> > Reviewed-by: Chung-Chiang Cheng <cccheng@synology.com>
> > Signed-off-by: BingJing Chang <bingjingc@synology.com>
>
> Looks good.
> Please just update that paragraph to be more clear about what is going on.
>
> Thanks.
>
> > ---
> >  fs/btrfs/file.c | 6 +++---
> >  1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> > index 0e155f0..dccb017 100644
> > --- a/fs/btrfs/file.c
> > +++ b/fs/btrfs/file.c
> > @@ -2735,8 +2735,6 @@ int btrfs_replace_file_extents(struct inode *inode, struct btrfs_path *path,
> >                         extent_info->file_offset += replace_len;
> >                 }
> >
> > -               cur_offset = drop_args.drop_end;
> > -
> >                 ret = btrfs_update_inode(trans, root, BTRFS_I(inode));
> >                 if (ret)
> >                         break;
> > @@ -2756,7 +2754,9 @@ int btrfs_replace_file_extents(struct inode *inode, struct btrfs_path *path,
> >                 BUG_ON(ret);    /* shouldn't happen */
> >                 trans->block_rsv = rsv;
> >
> > -               if (!extent_info) {
> > +               cur_offset = drop_args.drop_end;
> > +               len = end - cur_offset;
> > +               if (!extent_info && len) {
> >                         ret = find_first_non_hole(BTRFS_I(inode), &cur_offset,
> >                                                   &len);
> >                         if (unlikely(ret < 0))
> > --
> > 2.7.4
> >
>
>
> --
> Filipe David Manana,
>
> “Whether you think you can, or you think you can't — you're right.”

Thanks,
BingJing Chang

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2021-03-25  2:07 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-03-24  9:31 [PATCH] btrfs: fix a potential hole-punching failure bingjingc
2021-03-24 11:30 ` Filipe Manana
2021-03-25  2:06   ` bingjing chang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).