From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Tavian Barnes <tavianator@tavianator.com>
Cc: "linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>
Subject: Re: About the weird tree block corruption
Date: Sat, 16 Mar 2024 06:21:58 +1030
Message-ID: <49c7a4c8-852c-4b9c-ba57-938a097aaa6a@gmx.com>
In-Reply-To: <CABg4E-nKSZR4kvAGfxKLwAoH1_fJXwQb91spFAMsU9L1vqEpiA@mail.gmail.com>
On 2024/3/16 01:53, Tavian Barnes wrote:
> On Wed, Mar 13, 2024 at 2:07 AM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>> [SNIP]
>>
>> The second patch makes tree-checker BUG_ON() when something
>> goes wrong.
>> This patch should only be applied if you can reliably reproduce it
>> inside a VM.
>
> Okay, I have finally reproduced this in a VM. I had to add this hunk
> to your patch 0002 in order to trigger the BUG_ON:
>
> diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
> index c8fbcae4e88e..4ee7a717642a 100644
> --- a/fs/btrfs/tree-checker.c
> +++ b/fs/btrfs/tree-checker.c
> @@ -2047,6 +2051,7 @@ int btrfs_check_eb_owner(const struct extent_buffer *eb, u64 root_owner)
> btrfs_header_level(eb) == 0 ? "leaf" : "node",
> root_owner, btrfs_header_bytenr(eb), eb_owner,
> root_owner);
> + BUG_ON(1);
> return -EUCLEAN;
> }
> return 0;
> @@ -2062,6 +2067,7 @@ int btrfs_check_eb_owner(const struct extent_buffer *eb, u64 root_owner)
> btrfs_header_level(eb) == 0 ? "leaf" : "node",
> root_owner, btrfs_header_bytenr(eb), eb_owner,
> BTRFS_FIRST_FREE_OBJECTID, BTRFS_LAST_FREE_OBJECTID);
> + BUG_ON(1);
> return -EUCLEAN;
> }
> return 0;
>
>> When using the 2nd patch, it's strongly recommended to enable the
>> following sysctl:
>>
>> kernel.ftrace_dump_on_oops = 1
>> kernel.panic = 5
>> kernel.panic_on_oops = 1
>
> I also set kernel.oops_all_cpu_backtrace = 1, and ran with nowatchdog,
> otherwise I got watchdog backtraces (due to the slow console)
> interspersed with the traces, which made them hard to read.
oops_all_cpu_backtrace looks like overkill, and it seems to flood the
output.
>
>> And you need a way to reliably access the VM (either netconsole or a
>> serial console setup).
>> In that case, you would get the whole ftrace buffer dumped to the
>> netconsole/serial console.
>>
>> This has the extra benefit of reducing the noise. But it really needs a
>> reliable VM setup and can be a little tricky to set up.
>
> I got this to work, the console logs are attached. I added
>
> echo 1 > $tracefs/buffer_size_kb
>
> otherwise it tried to dump 48MiB over the serial console which I
> didn't have the patience for. Hopefully that's a big enough buffer, I
> can re-run it if you need more logs.
That's totally fine, and that's exactly what I do during debugging.
>
>> Feel free to ask for any extra help setting up the environment, as you're
>> our last hope to properly pin down the bug.
>
> Hopefully this trace helps you debug this. Let me know whenever you
> have something else for me to test.
The btrfs_crit() line uses btrfs_header_bytenr(), which can itself be
corrupted.
So it's much better to add an extra trace_printk() that prints eb->start,
so that we can match the output against the other trace lines.
There is some interesting output, though: the trace_printk() calls in
btrfs_release_extent_buffer_pages() already show the page refs at 0, and
the page contents are already incorrect.
It may be a clue, but without a properly matching trace it's still hard
to say.
I'm afraid you will need to add extra trace_printk() lines, like the
one below, at all the return -EUCLEAN locations:

    trace_printk("eb=%llu\n", eb->start);
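
To illustrate why eb->start is the better key to log, here is a rough
user-space sketch, not kernel code. struct fake_eb, fake_check_eb_owner()
and FAKE_EUCLEAN are made-up names for this example only, and
fprintf(stderr, ...) stands in for trace_printk(). The point is that the
in-memory start address stays valid even when the header fields are
garbage:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's struct extent_buffer. */
struct fake_eb {
	uint64_t start;         /* in-memory logical address of the buffer */
	uint64_t header_bytenr; /* bytenr read from the header, may be corrupted */
	uint64_t header_owner;  /* owner read from the header, may be corrupted */
};

/* Same value as EUCLEAN in asm-generic/errno.h; hard-coded for portability. */
#define FAKE_EUCLEAN 117

/*
 * Simplified owner check. On mismatch it logs eb->start first, because
 * that value does not come from the (possibly corrupted) header, so it
 * can be matched against other trace lines that also print eb->start.
 */
int fake_check_eb_owner(const struct fake_eb *eb, uint64_t root_owner)
{
	assert(eb != NULL);

	if (eb->header_owner != root_owner) {
		/* Stand-in for: trace_printk("eb=%llu\n", eb->start); */
		fprintf(stderr, "eb=%llu\n", (unsigned long long)eb->start);
		/* The header-derived values are printed only as extra context. */
		fprintf(stderr, "header bytenr=%llu owner=%llu expected=%llu\n",
			(unsigned long long)eb->header_bytenr,
			(unsigned long long)eb->header_owner,
			(unsigned long long)root_owner);
		return -FAKE_EUCLEAN;
	}
	return 0;
}
```

With a buffer whose header has been zeroed, e.g. { .start = 4096,
.header_bytenr = 0, .header_owner = 0 } checked against root_owner 5, the
first line still prints the real address (eb=4096), while the header
bytenr alone would be useless for matching.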
>
> I can also try to send you the VM, but I'm not sure how to package it
> up exactly. It has two (emulated) NVMEs with LUKS and BTRFS raid0 on
> top.
>
Just sending me the rootfs qcow2 would be more than enough.
I can set up LUKS and btrfs by myself here.
Thanks,
Qu