linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] btrfs: fixes for commit 947a629988f1 ("btrfs: move tree block parentness check into validate_extent_buffer()")
@ 2022-12-28 23:32 Qu Wenruo
  2022-12-28 23:32 ` [PATCH 1/2] btrfs: add error message for metadata level mismatch Qu Wenruo
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Qu Wenruo @ 2022-12-28 23:32 UTC (permalink / raw)
  To: linux-btrfs

There is a regression report from Mikhail that his btrfs RAID0 fs
randomly flips to RO after commit 947a629988f1.

It turns out that, the offending commit can not handle tree blocks
crossing stripe boundary.

Although tree blocks crossing stripe boundary is not an ideal situation,
we should still be able to correctly handle it.

This patchset firstly adds the missing level mismatch error message,
then fix the offending commit.

Qu Wenruo (2):
  btrfs: add error message for metadata level mismatch
  btrfs: fix the false alert on bad tree level

 fs/btrfs/disk-io.c   |  4 ++++
 fs/btrfs/extent_io.c | 31 ++++++++++++++++++++++++++-----
 2 files changed, 30 insertions(+), 5 deletions(-)

-- 
2.39.0


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH 1/2] btrfs: add error message for metadata level mismatch
  2022-12-28 23:32 [PATCH 0/2] btrfs: fixes for commit 947a629988f1 ("btrfs: move tree block parentness check into validate_extent_buffer()") Qu Wenruo
@ 2022-12-28 23:32 ` Qu Wenruo
  2022-12-28 23:32 ` [PATCH 2/2] btrfs: fix the false alert on bad tree level Qu Wenruo
  2023-01-02 15:13 ` [PATCH 0/2] btrfs: fixes for commit 947a629988f1 ("btrfs: move tree block parentness check into validate_extent_buffer()") David Sterba
  2 siblings, 0 replies; 5+ messages in thread
From: Qu Wenruo @ 2022-12-28 23:32 UTC (permalink / raw)
  To: linux-btrfs

From a recent regression report, we found that after commit 947a629988f1
("btrfs: move tree block parentness check into
validate_extent_buffer()") if we have a level mismatch (false alert
though), there is no error message at all.

This makes later debug much harder.

This patch will add the proper error message for such case.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/disk-io.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index f8b5955f003f..ec48adba2ae9 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -530,6 +530,10 @@ static int validate_extent_buffer(struct extent_buffer *eb,
 	}
 
 	if (found_level != check->level) {
+		btrfs_err(fs_info,
+	"level verify failed on logical %llu mirror %u wanted %u found %u",
+			  eb->start, eb->read_mirror, check->level,
+			  found_level);
 		ret = -EIO;
 		goto out;
 	}
-- 
2.39.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 2/2] btrfs: fix the false alert on bad tree level
  2022-12-28 23:32 [PATCH 0/2] btrfs: fixes for commit 947a629988f1 ("btrfs: move tree block parentness check into validate_extent_buffer()") Qu Wenruo
  2022-12-28 23:32 ` [PATCH 1/2] btrfs: add error message for metadata level mismatch Qu Wenruo
@ 2022-12-28 23:32 ` Qu Wenruo
  2023-01-02 15:14   ` David Sterba
  2023-01-02 15:13 ` [PATCH 0/2] btrfs: fixes for commit 947a629988f1 ("btrfs: move tree block parentness check into validate_extent_buffer()") David Sterba
  2 siblings, 1 reply; 5+ messages in thread
From: Qu Wenruo @ 2022-12-28 23:32 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Mikhail Gavrilov

[BUG]
There is a bug report that on a RAID0 nvme btrfs system, under heavy
write load the fs can flip RO randomly.

With extra debug, it shows some tree blocks failed to pass its level
checks, and if that happens at critical path of a transaction, we abort
the transacation:

 BTRFS error (device nvme0n1p3): level verify failed on logical 5446121209856 mirror 1 wanted 0 found 1
 BTRFS error (device nvme0n1p3: state A): Transaction aborted (error -5)
 BTRFS: error (device nvme0n1p3: state A) in btrfs_finish_ordered_io:3343: errno=-5 IO failure
 BTRFS info (device nvme0n1p3: state EA): forced readonly

[CAUSE]
The reporter has already bisected to commit 947a629988f1 ("btrfs: move
tree block parentness check into validate_extent_buffer()").

And with extra debug, it shows we can have btrfs_tree_parent_check
filled with all zero in the following call trace:

  <TASK>
  submit_one_bio+0xd4/0xe0
  submit_extent_page+0x142/0x550
  read_extent_buffer_pages+0x584/0x9c0
  ? __pfx_end_bio_extent_readpage+0x10/0x10
  ? folio_unlock+0x1d/0x50
  btrfs_read_extent_buffer+0x98/0x150
  read_tree_block+0x43/0xa0
  read_block_for_search+0x266/0x370
  btrfs_search_slot+0x351/0xd30
  ? lock_is_held_type+0xe8/0x140
  btrfs_lookup_csum+0x63/0x150
  btrfs_csum_file_blocks+0x197/0x6c0
  ? sched_clock_cpu+0x9f/0xc0
  ? lock_release+0x14b/0x440
  ? _raw_read_unlock+0x29/0x50
  btrfs_finish_ordered_io+0x441/0x860
  btrfs_work_helper+0xfe/0x400
  ? lock_is_held_type+0xe8/0x140
  process_one_work+0x294/0x5b0
  worker_thread+0x4f/0x3a0
  ? __pfx_worker_thread+0x10/0x10
  kthread+0xf5/0x120
  ? __pfx_kthread+0x10/0x10
  ret_from_fork+0x2c/0x50
  </TASK>

Currently we only copy the btrfs_tree_parent_check structure into bbio
at read_extent_buffer_pages() after we have assembled the bbio.

But as shown in the above call trace, submit_extent_page() itself can
already submit the bbio, leaving the bbio->parent_check uninitialized,
and cause the false alert.

[FIX]
Instead of copying @check into bbio after bbio is assembled, we pass
@check in btrfs_bio_ctrl::parent_check, and copy the content of
parent_check in submit_one_bio() for metadata read.

By this, we should be able to pass the needed info for metadata endio
verification, and fix the false alert.

Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Link: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Fixes: 947a629988f1 ("btrfs: move tree block parentness check into validate_extent_buffer()")
Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/extent_io.c | 31 ++++++++++++++++++++++++++-----
 1 file changed, 26 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 83dd3aa59663..b11332482d57 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -103,6 +103,16 @@ struct btrfs_bio_ctrl {
 	u32 len_to_oe_boundary;
 	btrfs_bio_end_io_t end_io_func;
 
+	/*
+	 * This is for metadata read, to provide the extra needed
+	 * verification info.
+	 * This has to be provided for submit_one_bio(), as submit_one_bio()
+	 * can submit a bio if it ends at stripe boundary.
+	 * If no such parent_check provided, the metadata can hit false alert
+	 * at endio time.
+	 */
+	struct btrfs_tree_parent_check *parent_check;
+
 	/*
 	 * Tell writepage not to lock the state bits for this range, it still
 	 * does the unlocking.
@@ -133,13 +143,24 @@ static void submit_one_bio(struct btrfs_bio_ctrl *bio_ctrl)
 
 	btrfs_bio(bio)->file_offset = page_offset(bv->bv_page) + bv->bv_offset;
 
-	if (!is_data_inode(&inode->vfs_inode))
+	if (!is_data_inode(&inode->vfs_inode)) {
+		if (btrfs_op(bio) != BTRFS_MAP_WRITE) {
+			/*
+			 * For metadata read, we should have the parent_check,
+			 * and copy it to bbio for metadata verification.
+			 */
+			ASSERT(bio_ctrl->parent_check);
+			memcpy(&btrfs_bio(bio)->parent_check,
+			       bio_ctrl->parent_check,
+			       sizeof(struct btrfs_tree_parent_check));
+		}
 		btrfs_submit_metadata_bio(inode, bio, mirror_num);
-	else if (btrfs_op(bio) == BTRFS_MAP_WRITE)
+	} else if (btrfs_op(bio) == BTRFS_MAP_WRITE) {
 		btrfs_submit_data_write_bio(inode, bio, mirror_num);
-	else
+	} else {
 		btrfs_submit_data_read_bio(inode, bio, mirror_num,
 					   bio_ctrl->compress_type);
+	}
 
 	/* The bio is owned by the end_io handler now */
 	bio_ctrl->bio = NULL;
@@ -4829,6 +4850,7 @@ static int read_extent_buffer_subpage(struct extent_buffer *eb, int wait,
 	struct extent_state *cached_state = NULL;
 	struct btrfs_bio_ctrl bio_ctrl = {
 		.mirror_num = mirror_num,
+		.parent_check = check,
 	};
 	int ret = 0;
 
@@ -4878,7 +4900,6 @@ static int read_extent_buffer_subpage(struct extent_buffer *eb, int wait,
 		 */
 		atomic_dec(&eb->io_pages);
 	}
-	memcpy(&btrfs_bio(bio_ctrl.bio)->parent_check, check, sizeof(*check));
 	submit_one_bio(&bio_ctrl);
 	if (ret || wait != WAIT_COMPLETE) {
 		free_extent_state(cached_state);
@@ -4905,6 +4926,7 @@ int read_extent_buffer_pages(struct extent_buffer *eb, int wait, int mirror_num,
 	unsigned long num_reads = 0;
 	struct btrfs_bio_ctrl bio_ctrl = {
 		.mirror_num = mirror_num,
+		.parent_check = check,
 	};
 
 	if (test_bit(EXTENT_BUFFER_UPTODATE, &eb->bflags))
@@ -4996,7 +5018,6 @@ int read_extent_buffer_pages(struct extent_buffer *eb, int wait, int mirror_num,
 		}
 	}
 
-	memcpy(&btrfs_bio(bio_ctrl.bio)->parent_check, check, sizeof(*check));
 	submit_one_bio(&bio_ctrl);
 
 	if (ret || wait != WAIT_COMPLETE)
-- 
2.39.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH 0/2] btrfs: fixes for commit 947a629988f1 ("btrfs: move tree block parentness check into validate_extent_buffer()")
  2022-12-28 23:32 [PATCH 0/2] btrfs: fixes for commit 947a629988f1 ("btrfs: move tree block parentness check into validate_extent_buffer()") Qu Wenruo
  2022-12-28 23:32 ` [PATCH 1/2] btrfs: add error message for metadata level mismatch Qu Wenruo
  2022-12-28 23:32 ` [PATCH 2/2] btrfs: fix the false alert on bad tree level Qu Wenruo
@ 2023-01-02 15:13 ` David Sterba
  2 siblings, 0 replies; 5+ messages in thread
From: David Sterba @ 2023-01-02 15:13 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

On Thu, Dec 29, 2022 at 07:32:22AM +0800, Qu Wenruo wrote:
> There is a regression report from Mikhail that his btrfs RAID0 fs
> randomly flips to RO after commit 947a629988f1.
> 
> It turns out that, the offending commit can not handle tree blocks
> crossing stripe boundary.
> 
> Although tree blocks crossing stripe boundary is not an ideal situation,
> we should still be able to correctly handle it.
> 
> This patchset firstly adds the missing level mismatch error message,
> then fix the offending commit.
> 
> Qu Wenruo (2):
>   btrfs: add error message for metadata level mismatch
>   btrfs: fix the false alert on bad tree level

Thanks for tracking it down, patches added to misc-next.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 2/2] btrfs: fix the false alert on bad tree level
  2022-12-28 23:32 ` [PATCH 2/2] btrfs: fix the false alert on bad tree level Qu Wenruo
@ 2023-01-02 15:14   ` David Sterba
  0 siblings, 0 replies; 5+ messages in thread
From: David Sterba @ 2023-01-02 15:14 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs, Mikhail Gavrilov

On Thu, Dec 29, 2022 at 07:32:24AM +0800, Qu Wenruo wrote:
> 
> By this, we should be able to pass the needed info for metadata endio
> verification, and fix the false alert.
> 
> Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
> Link: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>

This is probably a copy&paste mistake, I've replaced it with the mail
thread link.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2023-01-02 15:20 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-12-28 23:32 [PATCH 0/2] btrfs: fixes for commit 947a629988f1 ("btrfs: move tree block parentness check into validate_extent_buffer()") Qu Wenruo
2022-12-28 23:32 ` [PATCH 1/2] btrfs: add error message for metadata level mismatch Qu Wenruo
2022-12-28 23:32 ` [PATCH 2/2] btrfs: fix the false alert on bad tree level Qu Wenruo
2023-01-02 15:14   ` David Sterba
2023-01-02 15:13 ` [PATCH 0/2] btrfs: fixes for commit 947a629988f1 ("btrfs: move tree block parentness check into validate_extent_buffer()") David Sterba

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).