linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/2] btrfs: fix btrfs_file_extent_item::ram_bytes of btrfs_split_ordered_extent()
@ 2024-04-17  4:54 Qu Wenruo
  2024-04-17  4:54 ` [PATCH v2 1/2] btrfs: set correct ram_bytes when splitting ordered extent Qu Wenruo
                   ` (4 more replies)
  0 siblings, 5 replies; 8+ messages in thread
From: Qu Wenruo @ 2024-04-17  4:54 UTC (permalink / raw)
  To: linux-btrfs

[CHANGELOG]
v2:
- Update the comment on file extent item tree-checker 
  To be less confusing for future readers.

- Remove one fixes tag of the first patch
  The bug goes back to the introduction of zoned ordered extent
  splitting, thus that oldest commit should be the cause.

During my extent_map members rework, I added a sanity check to make sure
regular non-compressed extent_map would have its disk_num_bytes to match
ram_bytes.

But that extent_map sanity check always fail as we have on-disk file
extent items which has its ram_bytes much larger than the corresponding
disk_num_bytes, even if it's not compressed.

It turns out that, the ram_bytes > disk_num_bytes is caused by
btrfs_split_ordered_extent(), where it doesn't properly update
ram_bytes, resulting it larger than disk_num_bytes.

Thankfully everything is fine, as our code doesn't really bother
ram_bytes for non-compressed regular file extents, so no real damage.

Still I'd like to catch such problem in the future, so add another
tree-checker patch for this case.

And since the invalid ram_bytes is already in the wild for a while, we
do not want to bother the end users to fix their fs for nothing.
So the check is only behind DEBUG builds.

Furthermore, the tree-checker is only to make sure @ram_bytes <
@disk_num_bytes for non-compressed file extents.
As we still have other locations to make @ram_bytes < @disk_num_bytes.

And for btrfs-progs, I'm going to add extra check and repair support
soon.

Qu Wenruo (2):
  btrfs: set correct ram_bytes when splitting ordered extent
  btrfs: tree-checker: add one extra file extent item ram_bytes check

 fs/btrfs/ordered-data.c |  1 +
 fs/btrfs/tree-checker.c | 35 +++++++++++++++++++++++++++++------
 2 files changed, 30 insertions(+), 6 deletions(-)

-- 
2.44.0


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v2 1/2] btrfs: set correct ram_bytes when splitting ordered extent
  2024-04-17  4:54 [PATCH v2 0/2] btrfs: fix btrfs_file_extent_item::ram_bytes of btrfs_split_ordered_extent() Qu Wenruo
@ 2024-04-17  4:54 ` Qu Wenruo
  2024-04-17  4:54 ` [PATCH v2 2/2] btrfs: tree-checker: add one extra file extent item ram_bytes check Qu Wenruo
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: Qu Wenruo @ 2024-04-17  4:54 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Filipe Manana

[BUG]
When running generic/287, the following file extent items can be
generated:

        item 16 key (258 EXTENT_DATA 2682880) itemoff 15305 itemsize 53
                generation 9 type 1 (regular)
                extent data disk byte 1378414592 nr 462848
                extent data offset 0 nr 462848 ram 2097152
                extent compression 0 (none)

Note that file extent item is not a compressed one, but its ram_bytes is
way larger than its disk_num_bytes.

According to btrfs on-disk scheme, ram_bytes should match disk_num_bytes
if it's not a compressed one.

[CAUSE]
Since commit b73a6fd1b1ef ("btrfs: split partial dio bios before
submit"), for partial dio writes, we would split the ordered extent.

However the function btrfs_split_ordered_extent() doesn't update the
ram_bytes even it has already shrunk the disk_num_bytes.

Originally the function btrfs_split_ordered_extent() is only introduced
for zoned devices in commit d22002fd37bd ("btrfs: zoned: split ordered
extent when bio is sent"), but later commit b73a6fd1b1ef ("btrfs: split
partial dio bios before submit") makes non-zoned btrfs affected.

Thankfully for un-compressed file extent, we do not really utilize the
ram_bytes member, thus it won't cause any real problem.

[FIX]
Also update btrfs_ordered_extent::ram_bytes inside
btrfs_split_ordered_extent().

Fixes: d22002fd37bd ("btrfs: zoned: split ordered extent when bio is sent")
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/ordered-data.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
index b749ba45da2b..c2a42bcde98e 100644
--- a/fs/btrfs/ordered-data.c
+++ b/fs/btrfs/ordered-data.c
@@ -1188,6 +1188,7 @@ struct btrfs_ordered_extent *btrfs_split_ordered_extent(
 	ordered->disk_bytenr += len;
 	ordered->num_bytes -= len;
 	ordered->disk_num_bytes -= len;
+	ordered->ram_bytes -= len;
 
 	if (test_bit(BTRFS_ORDERED_IO_DONE, &ordered->flags)) {
 		ASSERT(ordered->bytes_left == 0);
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v2 2/2] btrfs: tree-checker: add one extra file extent item ram_bytes check
  2024-04-17  4:54 [PATCH v2 0/2] btrfs: fix btrfs_file_extent_item::ram_bytes of btrfs_split_ordered_extent() Qu Wenruo
  2024-04-17  4:54 ` [PATCH v2 1/2] btrfs: set correct ram_bytes when splitting ordered extent Qu Wenruo
@ 2024-04-17  4:54 ` Qu Wenruo
  2024-04-17  9:06 ` [PATCH v2 0/2] btrfs: fix btrfs_file_extent_item::ram_bytes of btrfs_split_ordered_extent() Johannes Thumshirn
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: Qu Wenruo @ 2024-04-17  4:54 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Filipe Manana

During my development on extent map cleanups, I hit a case where we can
create a file extent item that has ram_bytes double the size of
num_bytes but it's not compressed.

Later it turns out to be a bug in btrfs_split_ordered_extent(), and
thankfully it doesn't cause any real corruption, just a drift from
on-disk format.

Here we add an extra check on ram_bytes for btrfs_file_extent_item to
catch such problem.

However considering the incorrect ram_bytes are already in the wild, and
no real data corruption, we do not want end users to be bothered as their
data is still consistent.

So this patch would only hide the check behind DEBUG builds for us
developers to catch future problem.

Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/tree-checker.c | 35 +++++++++++++++++++++++++++++------
 1 file changed, 29 insertions(+), 6 deletions(-)

diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
index c8fbcae4e88e..9f1597fe40e7 100644
--- a/fs/btrfs/tree-checker.c
+++ b/fs/btrfs/tree-checker.c
@@ -212,6 +212,7 @@ static int check_extent_data_item(struct extent_buffer *leaf,
 	u32 sectorsize = fs_info->sectorsize;
 	u32 item_size = btrfs_item_size(leaf, slot);
 	u64 extent_end;
+	u8 compression;
 
 	if (unlikely(!IS_ALIGNED(key->offset, sectorsize))) {
 		file_extent_err(leaf, slot,
@@ -251,16 +252,15 @@ static int check_extent_data_item(struct extent_buffer *leaf,
 		return -EUCLEAN;
 	}
 
+	compression = btrfs_file_extent_compression(leaf, fi);
 	/*
 	 * Support for new compression/encryption must introduce incompat flag,
 	 * and must be caught in open_ctree().
 	 */
-	if (unlikely(btrfs_file_extent_compression(leaf, fi) >=
-		     BTRFS_NR_COMPRESS_TYPES)) {
+	if (unlikely(compression >= BTRFS_NR_COMPRESS_TYPES)) {
 		file_extent_err(leaf, slot,
 	"invalid compression for file extent, have %u expect range [0, %u]",
-			btrfs_file_extent_compression(leaf, fi),
-			BTRFS_NR_COMPRESS_TYPES - 1);
+			compression, BTRFS_NR_COMPRESS_TYPES - 1);
 		return -EUCLEAN;
 	}
 	if (unlikely(btrfs_file_extent_encryption(leaf, fi))) {
@@ -279,8 +279,7 @@ static int check_extent_data_item(struct extent_buffer *leaf,
 		}
 
 		/* Compressed inline extent has no on-disk size, skip it */
-		if (btrfs_file_extent_compression(leaf, fi) !=
-		    BTRFS_COMPRESS_NONE)
+		if (compression != BTRFS_COMPRESS_NONE)
 			return 0;
 
 		/* Uncompressed inline extent size must match item size */
@@ -319,6 +318,30 @@ static int check_extent_data_item(struct extent_buffer *leaf,
 		return -EUCLEAN;
 	}
 
+	/*
+	 * If it's a uncompressed regular extents, its ram size should match
+	 * disk_num_bytes. But for now we have several call sites that doesn't
+	 * properly update @ram_bytes, so at least make sure
+	 * @ram_bytes <= @disk_num_bytes.
+	 *
+	 * However we had a bug related to @ram_bytes update, causing
+	 * all zoned and regular DIO to be affected.
+	 * Thankfully the ram_bytes is not critical for non-compressed file extents.
+	 * So here we hide the check behind DEBUG builds for developers only.
+	 */
+#ifdef CONFIG_BTRFS_DEBUG
+	if (unlikely(compression == BTRFS_COMPRESS_NONE &&
+		     btrfs_file_extent_disk_bytenr(leaf, fi) &&
+		     btrfs_file_extent_ram_bytes(leaf, fi) >
+		     btrfs_file_extent_disk_num_bytes(leaf, fi))) {
+		file_extent_err(leaf, slot,
+				"invalid ram_bytes, have %llu expect <= %llu",
+				btrfs_file_extent_ram_bytes(leaf, fi),
+				btrfs_file_extent_disk_num_bytes(leaf, fi));
+		return -EUCLEAN;
+	}
+#endif
+
 	/*
 	 * Check that no two consecutive file extent items, in the same leaf,
 	 * present ranges that overlap each other.
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 0/2] btrfs: fix btrfs_file_extent_item::ram_bytes of btrfs_split_ordered_extent()
  2024-04-17  4:54 [PATCH v2 0/2] btrfs: fix btrfs_file_extent_item::ram_bytes of btrfs_split_ordered_extent() Qu Wenruo
  2024-04-17  4:54 ` [PATCH v2 1/2] btrfs: set correct ram_bytes when splitting ordered extent Qu Wenruo
  2024-04-17  4:54 ` [PATCH v2 2/2] btrfs: tree-checker: add one extra file extent item ram_bytes check Qu Wenruo
@ 2024-04-17  9:06 ` Johannes Thumshirn
  2024-04-19 17:29 ` David Sterba
  2024-04-23 14:17 ` David Sterba
  4 siblings, 0 replies; 8+ messages in thread
From: Johannes Thumshirn @ 2024-04-17  9:06 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 0/2] btrfs: fix btrfs_file_extent_item::ram_bytes of btrfs_split_ordered_extent()
  2024-04-17  4:54 [PATCH v2 0/2] btrfs: fix btrfs_file_extent_item::ram_bytes of btrfs_split_ordered_extent() Qu Wenruo
                   ` (2 preceding siblings ...)
  2024-04-17  9:06 ` [PATCH v2 0/2] btrfs: fix btrfs_file_extent_item::ram_bytes of btrfs_split_ordered_extent() Johannes Thumshirn
@ 2024-04-19 17:29 ` David Sterba
  2024-04-19 22:00   ` Qu Wenruo
  2024-04-23 14:17 ` David Sterba
  4 siblings, 1 reply; 8+ messages in thread
From: David Sterba @ 2024-04-19 17:29 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

On Wed, Apr 17, 2024 at 02:24:37PM +0930, Qu Wenruo wrote:
> [CHANGELOG]
> v2:
> - Update the comment on file extent item tree-checker 
>   To be less confusing for future readers.
> 
> - Remove one fixes tag of the first patch
>   The bug goes back to the introduction of zoned ordered extent
>   splitting, thus that oldest commit should be the cause.
> 
> During my extent_map members rework, I added a sanity check to make sure
> regular non-compressed extent_map would have its disk_num_bytes to match
> ram_bytes.
> 
> But that extent_map sanity check always fail as we have on-disk file
> extent items which has its ram_bytes much larger than the corresponding
> disk_num_bytes, even if it's not compressed.
> 
> It turns out that, the ram_bytes > disk_num_bytes is caused by
> btrfs_split_ordered_extent(), where it doesn't properly update
> ram_bytes, resulting it larger than disk_num_bytes.
> 
> Thankfully everything is fine, as our code doesn't really bother
> ram_bytes for non-compressed regular file extents, so no real damage.
> 
> Still I'd like to catch such problem in the future, so add another
> tree-checker patch for this case.
> 
> And since the invalid ram_bytes is already in the wild for a while, we
> do not want to bother the end users to fix their fs for nothing.
> So the check is only behind DEBUG builds.

For testing this is OK, but would it make sense to fix the wrong values
automatically?

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 0/2] btrfs: fix btrfs_file_extent_item::ram_bytes of btrfs_split_ordered_extent()
  2024-04-19 17:29 ` David Sterba
@ 2024-04-19 22:00   ` Qu Wenruo
  2024-04-19 22:44     ` David Sterba
  0 siblings, 1 reply; 8+ messages in thread
From: Qu Wenruo @ 2024-04-19 22:00 UTC (permalink / raw)
  To: dsterba, Qu Wenruo; +Cc: linux-btrfs



在 2024/4/20 02:59, David Sterba 写道:
> On Wed, Apr 17, 2024 at 02:24:37PM +0930, Qu Wenruo wrote:
>> [CHANGELOG]
>> v2:
>> - Update the comment on file extent item tree-checker
>>    To be less confusing for future readers.
>>
>> - Remove one fixes tag of the first patch
>>    The bug goes back to the introduction of zoned ordered extent
>>    splitting, thus that oldest commit should be the cause.
>>
>> During my extent_map members rework, I added a sanity check to make sure
>> regular non-compressed extent_map would have its disk_num_bytes to match
>> ram_bytes.
>>
>> But that extent_map sanity check always fail as we have on-disk file
>> extent items which has its ram_bytes much larger than the corresponding
>> disk_num_bytes, even if it's not compressed.
>>
>> It turns out that, the ram_bytes > disk_num_bytes is caused by
>> btrfs_split_ordered_extent(), where it doesn't properly update
>> ram_bytes, resulting it larger than disk_num_bytes.
>>
>> Thankfully everything is fine, as our code doesn't really bother
>> ram_bytes for non-compressed regular file extents, so no real damage.
>>
>> Still I'd like to catch such problem in the future, so add another
>> tree-checker patch for this case.
>>
>> And since the invalid ram_bytes is already in the wild for a while, we
>> do not want to bother the end users to fix their fs for nothing.
>> So the check is only behind DEBUG builds.
>
> For testing this is OK, but would it make sense to fix the wrong values
> automatically?
>

I'm also thinking about this, two alternatives, but neither is perfect:

- Fix the value at eb level, aka, updates the ram_bytes directly at read
   time
   This should fix the problem forever, but it has its own problems
   related to read-only mounts.
   If we do not do the fix for RO mounts, then remounting to RW, and we
   would have cached result untouched.
   But if we do, we're modifying ebs, how to write them back for a full
   RO mounts (e.g. rescue=all)?

- Only fix the extent_map::ram_bytes value
   This is not going to change anything on-disk.

And considering this "problem" has no real chance for corruption, I do
not believe we need all those hassles.

My plan to repair the mismatch would all be inside btrfs-progs, after I
have pinned down other call sites resulting smaller ram_bytes than
disk_num_bytes.

Thanks,
Qu

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 0/2] btrfs: fix btrfs_file_extent_item::ram_bytes of btrfs_split_ordered_extent()
  2024-04-19 22:00   ` Qu Wenruo
@ 2024-04-19 22:44     ` David Sterba
  0 siblings, 0 replies; 8+ messages in thread
From: David Sterba @ 2024-04-19 22:44 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: dsterba, Qu Wenruo, linux-btrfs

On Sat, Apr 20, 2024 at 07:30:39AM +0930, Qu Wenruo wrote:
> >> And since the invalid ram_bytes is already in the wild for a while, we
> >> do not want to bother the end users to fix their fs for nothing.
> >> So the check is only behind DEBUG builds.
> >
> > For testing this is OK, but would it make sense to fix the wrong values
> > automatically?
> >
> 
> I'm also thinking about this, two alternatives, but neither is perfect:
> 
> - Fix the value at eb level, aka, updates the ram_bytes directly at read
>    time
>    This should fix the problem forever, but it has its own problems
>    related to read-only mounts.
>    If we do not do the fix for RO mounts, then remounting to RW, and we
>    would have cached result untouched.
>    But if we do, we're modifying ebs, how to write them back for a full
>    RO mounts (e.g. rescue=all)?
> 
> - Only fix the extent_map::ram_bytes value
>    This is not going to change anything on-disk.
> 
> And considering this "problem" has no real chance for corruption, I do
> not believe we need all those hassles.
> 
> My plan to repair the mismatch would all be inside btrfs-progs, after I
> have pinned down other call sites resulting smaller ram_bytes than
> disk_num_bytes.

I see, so the only chance to fix it on disk is to change the extent for
some other reason. Otherwise we can only make sure that an in-memory
value is correct so it does not propagate further.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 0/2] btrfs: fix btrfs_file_extent_item::ram_bytes of btrfs_split_ordered_extent()
  2024-04-17  4:54 [PATCH v2 0/2] btrfs: fix btrfs_file_extent_item::ram_bytes of btrfs_split_ordered_extent() Qu Wenruo
                   ` (3 preceding siblings ...)
  2024-04-19 17:29 ` David Sterba
@ 2024-04-23 14:17 ` David Sterba
  4 siblings, 0 replies; 8+ messages in thread
From: David Sterba @ 2024-04-23 14:17 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

On Wed, Apr 17, 2024 at 02:24:37PM +0930, Qu Wenruo wrote:
> [CHANGELOG]
> v2:
> - Update the comment on file extent item tree-checker 
>   To be less confusing for future readers.
> 
> - Remove one fixes tag of the first patch
>   The bug goes back to the introduction of zoned ordered extent
>   splitting, thus that oldest commit should be the cause.
> 
> During my extent_map members rework, I added a sanity check to make sure
> regular non-compressed extent_map would have its disk_num_bytes to match
> ram_bytes.
> 
> But that extent_map sanity check always fail as we have on-disk file
> extent items which has its ram_bytes much larger than the corresponding
> disk_num_bytes, even if it's not compressed.
> 
> It turns out that, the ram_bytes > disk_num_bytes is caused by
> btrfs_split_ordered_extent(), where it doesn't properly update
> ram_bytes, resulting it larger than disk_num_bytes.
> 
> Thankfully everything is fine, as our code doesn't really bother
> ram_bytes for non-compressed regular file extents, so no real damage.
> 
> Still I'd like to catch such problem in the future, so add another
> tree-checker patch for this case.
> 
> And since the invalid ram_bytes is already in the wild for a while, we
> do not want to bother the end users to fix their fs for nothing.
> So the check is only behind DEBUG builds.
> 
> Furthermore, the tree-checker is only to make sure @ram_bytes <
> @disk_num_bytes for non-compressed file extents.
> As we still have other locations to make @ram_bytes < @disk_num_bytes.
> 
> And for btrfs-progs, I'm going to add extra check and repair support
> soon.
> 
> Qu Wenruo (2):
>   btrfs: set correct ram_bytes when splitting ordered extent
>   btrfs: tree-checker: add one extra file extent item ram_bytes check

Reviewed-by: David Sterba <dsterba@suse.com>

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2024-04-23 14:24 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-04-17  4:54 [PATCH v2 0/2] btrfs: fix btrfs_file_extent_item::ram_bytes of btrfs_split_ordered_extent() Qu Wenruo
2024-04-17  4:54 ` [PATCH v2 1/2] btrfs: set correct ram_bytes when splitting ordered extent Qu Wenruo
2024-04-17  4:54 ` [PATCH v2 2/2] btrfs: tree-checker: add one extra file extent item ram_bytes check Qu Wenruo
2024-04-17  9:06 ` [PATCH v2 0/2] btrfs: fix btrfs_file_extent_item::ram_bytes of btrfs_split_ordered_extent() Johannes Thumshirn
2024-04-19 17:29 ` David Sterba
2024-04-19 22:00   ` Qu Wenruo
2024-04-19 22:44     ` David Sterba
2024-04-23 14:17 ` David Sterba

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).