All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] btrfs: do not error out if the extent ref hash doesn't match
@ 2021-02-16 20:43 Josef Bacik
  2021-02-17 10:56 ` Filipe Manana
  2021-02-17 17:08 ` David Sterba
  0 siblings, 2 replies; 3+ messages in thread
From: Josef Bacik @ 2021-02-16 20:43 UTC (permalink / raw)
  To: linux-btrfs, kernel-team; +Cc: Tuomas Lähdekorpi

The tree checker checks the extent ref hash at read and write time to
make sure we do not corrupt the file system.  Generally extent
references go inline, but if we have enough of them we need to make an
item, which looks like

key.objectid	= <bytenr>
key.type	= <BTRFS_EXTENT_DATA_REF_KEY|BTRFS_TREE_BLOCK_REF_KEY>
key.offset	= hash(tree, owner, offset)

However if key.offset collide with an unrelated extent reference we'll
simply key.offset++ until we get something that doesn't collide.
Obviously this doesn't match at tree checker time, and thus we error
while writing out the transaction.  This is relatively easy to
reproduce, simply do something like the following

xfs_io -f -c "pwrite 0 1M" file
offset=2

for i in {0..10000}
do
	xfs_io -c "reflink file 0 ${offset}M 1M" file
	offset=$(( offset + 2 ))
done

xfs_io -c "reflink file 0 17999258914816 1M" file
xfs_io -c "reflink file 0 35998517829632 1M" file
xfs_io -c "reflink file 0 53752752058368 1M" file

btrfs filesystem sync

And the sync will error out because we'll abort the transaction.  The
magic values above are used because they generate hash collisions with
the first file in the main subvol.

The fix for this is to remove the hash value check from tree checker, as
we have no idea which offset ours should belong to.

Reported-by: Tuomas Lähdekorpi <tuomas.lahdekorpi@gmail.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/tree-checker.c | 12 ------------
 1 file changed, 12 deletions(-)

diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
index 582061c7b547..2429d668db46 100644
--- a/fs/btrfs/tree-checker.c
+++ b/fs/btrfs/tree-checker.c
@@ -1453,22 +1453,10 @@ static int check_extent_data_ref(struct extent_buffer *leaf,
 		return -EUCLEAN;
 	}
 	for (; ptr < end; ptr += sizeof(*dref)) {
-		u64 root_objectid;
-		u64 owner;
 		u64 offset;
-		u64 hash;
 
 		dref = (struct btrfs_extent_data_ref *)ptr;
-		root_objectid = btrfs_extent_data_ref_root(leaf, dref);
-		owner = btrfs_extent_data_ref_objectid(leaf, dref);
 		offset = btrfs_extent_data_ref_offset(leaf, dref);
-		hash = hash_extent_data_ref(root_objectid, owner, offset);
-		if (unlikely(hash != key->offset)) {
-			extent_err(leaf, slot,
-	"invalid extent data ref hash, item has 0x%016llx key has 0x%016llx",
-				   hash, key->offset);
-			return -EUCLEAN;
-		}
 		if (unlikely(!IS_ALIGNED(offset, leaf->fs_info->sectorsize))) {
 			extent_err(leaf, slot,
 	"invalid extent data backref offset, have %llu expect aligned to %u",
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] btrfs: do not error out if the extent ref hash doesn't match
  2021-02-16 20:43 [PATCH] btrfs: do not error out if the extent ref hash doesn't match Josef Bacik
@ 2021-02-17 10:56 ` Filipe Manana
  2021-02-17 17:08 ` David Sterba
  1 sibling, 0 replies; 3+ messages in thread
From: Filipe Manana @ 2021-02-17 10:56 UTC (permalink / raw)
  To: Josef Bacik; +Cc: linux-btrfs, kernel-team, Tuomas Lähdekorpi

On Tue, Feb 16, 2021 at 8:46 PM Josef Bacik <josef@toxicpanda.com> wrote:
>
> The tree checker checks the extent ref hash at read and write time to
> make sure we do not corrupt the file system.  Generally extent
> references go inline, but if we have enough of them we need to make an
> item, which looks like
>
> key.objectid    = <bytenr>
> key.type        = <BTRFS_EXTENT_DATA_REF_KEY|BTRFS_TREE_BLOCK_REF_KEY>
> key.offset      = hash(tree, owner, offset)
>
> However if key.offset collide with an unrelated extent reference we'll
> simply key.offset++ until we get something that doesn't collide.
> Obviously this doesn't match at tree checker time, and thus we error
> while writing out the transaction.  This is relatively easy to
> reproduce, simply do something like the following
>
> xfs_io -f -c "pwrite 0 1M" file
> offset=2
>
> for i in {0..10000}
> do
>         xfs_io -c "reflink file 0 ${offset}M 1M" file
>         offset=$(( offset + 2 ))
> done
>
> xfs_io -c "reflink file 0 17999258914816 1M" file
> xfs_io -c "reflink file 0 35998517829632 1M" file
> xfs_io -c "reflink file 0 53752752058368 1M" file
>
> btrfs filesystem sync
>
> And the sync will error out because we'll abort the transaction.  The
> magic values above are used because they generate hash collisions with
> the first file in the main subvol.
>
> The fix for this is to remove the hash value check from tree checker, as
> we have no idea which offset ours should belong to.
>
> Reported-by: Tuomas Lähdekorpi <tuomas.lahdekorpi@gmail.com>
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Filipe Manana <fdmanana@suse.com>

Looks good, thanks.

> ---
>  fs/btrfs/tree-checker.c | 12 ------------
>  1 file changed, 12 deletions(-)
>
> diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
> index 582061c7b547..2429d668db46 100644
> --- a/fs/btrfs/tree-checker.c
> +++ b/fs/btrfs/tree-checker.c
> @@ -1453,22 +1453,10 @@ static int check_extent_data_ref(struct extent_buffer *leaf,
>                 return -EUCLEAN;
>         }
>         for (; ptr < end; ptr += sizeof(*dref)) {
> -               u64 root_objectid;
> -               u64 owner;
>                 u64 offset;
> -               u64 hash;
>
>                 dref = (struct btrfs_extent_data_ref *)ptr;
> -               root_objectid = btrfs_extent_data_ref_root(leaf, dref);
> -               owner = btrfs_extent_data_ref_objectid(leaf, dref);
>                 offset = btrfs_extent_data_ref_offset(leaf, dref);
> -               hash = hash_extent_data_ref(root_objectid, owner, offset);
> -               if (unlikely(hash != key->offset)) {
> -                       extent_err(leaf, slot,
> -       "invalid extent data ref hash, item has 0x%016llx key has 0x%016llx",
> -                                  hash, key->offset);
> -                       return -EUCLEAN;
> -               }
>                 if (unlikely(!IS_ALIGNED(offset, leaf->fs_info->sectorsize))) {
>                         extent_err(leaf, slot,
>         "invalid extent data backref offset, have %llu expect aligned to %u",
> --
> 2.26.2
>


-- 
Filipe David Manana,

“Whether you think you can, or you think you can't — you're right.”

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] btrfs: do not error out if the extent ref hash doesn't match
  2021-02-16 20:43 [PATCH] btrfs: do not error out if the extent ref hash doesn't match Josef Bacik
  2021-02-17 10:56 ` Filipe Manana
@ 2021-02-17 17:08 ` David Sterba
  1 sibling, 0 replies; 3+ messages in thread
From: David Sterba @ 2021-02-17 17:08 UTC (permalink / raw)
  To: Josef Bacik; +Cc: linux-btrfs, kernel-team, Tuomas Lähdekorpi

On Tue, Feb 16, 2021 at 03:43:22PM -0500, Josef Bacik wrote:
> The tree checker checks the extent ref hash at read and write time to
> make sure we do not corrupt the file system.  Generally extent
> references go inline, but if we have enough of them we need to make an
> item, which looks like
> 
> key.objectid	= <bytenr>
> key.type	= <BTRFS_EXTENT_DATA_REF_KEY|BTRFS_TREE_BLOCK_REF_KEY>
> key.offset	= hash(tree, owner, offset)
> 
> However if key.offset collide with an unrelated extent reference we'll
> simply key.offset++ until we get something that doesn't collide.
> Obviously this doesn't match at tree checker time, and thus we error
> while writing out the transaction.  This is relatively easy to
> reproduce, simply do something like the following
> 
> xfs_io -f -c "pwrite 0 1M" file
> offset=2
> 
> for i in {0..10000}
> do
> 	xfs_io -c "reflink file 0 ${offset}M 1M" file
> 	offset=$(( offset + 2 ))
> done
> 
> xfs_io -c "reflink file 0 17999258914816 1M" file
> xfs_io -c "reflink file 0 35998517829632 1M" file
> xfs_io -c "reflink file 0 53752752058368 1M" file
> 
> btrfs filesystem sync
> 
> And the sync will error out because we'll abort the transaction.  The
> magic values above are used because they generate hash collisions with
> the first file in the main subvol.
> 
> The fix for this is to remove the hash value check from tree checker, as
> we have no idea which offset ours should belong to.
> 
> Reported-by: Tuomas Lähdekorpi <tuomas.lahdekorpi@gmail.com>
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Thanks, I've added a comment stating that we can't check the hash there.

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2021-02-17 17:10 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-16 20:43 [PATCH] btrfs: do not error out if the extent ref hash doesn't match Josef Bacik
2021-02-17 10:56 ` Filipe Manana
2021-02-17 17:08 ` David Sterba

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.